retorquere / zotero-better-bibtex

Make Zotero effective for us LaTeX holdouts
https://retorque.re/zotero-better-bibtex/
MIT License
5.31k stars 285 forks

Feature request: auto-export merging, or synchronization with `.bib` file #1755

Closed tobiasBora closed 3 years ago

tobiasBora commented 3 years ago

Hello,

When I work with co-authors on a given project, we usually have one ref.bib file that contains all our references, say something like:

@incollection{refA, ...}
@incollection{refB, ...}

I would now like to be able to automatically add one of my Zotero collections to that file. One solution is to export it, say to a file zot.bib that ends with something like:

@incollection{zotRefA, ...}
@incollection{zotRefB, ...}

and manually copy it at the end of ref.bib to produce:

@incollection{refA, ...}
@incollection{refB, ...}
@incollection{zotRefA, ...}
@incollection{zotRefB, ...}

But if I add new references to my collection, or edit existing entries in Zotero, I have to work out by hand which entries changed or were added, and then copy those changes into ref.bib. That is hard to maintain (it is basically what I do right now, copying entries between Zotero and ref.bib). There is also the great auto-export feature, which automatically updates zot.bib when an entry changes. But again, ref.bib won't be changed, and I need to manually move every change from zot.bib into ref.bib. And if I ask Zotero to auto-export directly to ref.bib, it is even worse: all the existing entries like refA and refB, and any new entry added by my co-authors to that file, will be removed automatically.

So would it make sense to create a kind of auto-export merging, in which the destination file is not cleared, but is merged with the Zotero collection? That way, my co-authors can edit that file, and I can use Zotero to quickly add entries into the file.

I guess the only tricky part of the implementation is in the presence of conflicts, notably when ref.bib already contains an entry zotRefA, or if a co-author edits the entry zotRefA previously added by Zotero. We can imagine different ways to merge the ref.bib file and the zot.bib file:

  1. ref.bib has priority: if an entry is present in both files, the ref.bib version is kept. The only problem with this solution is that if an entry is updated in Zotero, it will not be updated in ref.bib.
  2. zot.bib has priority: if an entry is present in both files, the zot.bib version is kept. I guess this version is a bit dangerous because it may overwrite co-authors' changes. Also, as pointed out by retorquere, it can be emulated by including both ref.bib and zot.bib in the .tex.
  3. Ask the user, on a per-entry basis, which version to keep. Zotero could then remember the user's decision, in memory or in a file located next to the .bib (possibly with the content or hash of the entries, to detect later changes), so that we don't annoy the user at every merge operation. This option seems to be the best of both worlds, but it may be slightly harder to implement.
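As a rough illustration of options 1 and 2, here is a minimal Python sketch of a citekey-based merge. It is not BBT code: the function names (`split_entries`, `merge`) and the `priority` parameter are hypothetical, and the parser is a deliberate simplification that assumes every entry starts with `@type{key,` at the beginning of a line (so `@string` or `@comment` blocks are not handled).

```python
import re

# Simplifying assumption: an entry starts with "@type{key," at the start of a
# line and runs until the next such line (or the end of the file).
ENTRY_START = re.compile(r"^@(\w+)\s*\{\s*([^,\s]+)\s*,", re.MULTILINE)


def split_entries(bib_text):
    """Return a dict mapping citekey -> raw entry text, in file order."""
    matches = list(ENTRY_START.finditer(bib_text))
    entries = {}
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(bib_text)
        entries[m.group(2)] = bib_text[m.start():end].strip()
    return entries


def merge(ref_text, zot_text, priority="ref"):
    """Merge zot.bib into ref.bib by citekey.

    priority="ref" keeps the ref.bib version on conflict (option 1),
    priority="zot" keeps the zot.bib version (option 2).
    Returns the merged text and the list of conflicting citekeys.
    """
    merged = split_entries(ref_text)
    conflicts = []
    for key, entry in split_entries(zot_text).items():
        if key not in merged:
            merged[key] = entry          # new entry coming from Zotero
        elif merged[key] != entry:
            conflicts.append(key)        # same key, different content
            if priority == "zot":
                merged[key] = entry
    return "\n\n".join(merged.values()) + "\n", conflicts
```

The returned list of conflicting keys is where option 3 would hook in: instead of applying a fixed priority, each conflict could be shown to the user.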

Note that one could also imagine a two-way sync: when an entry is added or modified in the .bib file, the changes could be reported back into the Zotero collection. But I think this is riskier, as it could pollute the user's Zotero collection with many less-controlled entries. And if a user just fixes a typo in an entry, solution 3 would, I guess, be enough to inform the user of that typo (we could also imagine adding a button "accept distant change, and update Zotero's entry" for that specific case, but again that is a bit more complicated to implement, I guess).

retorquere commented 3 years ago

Two-way sync is not possible, at least not without a lot of extra history registration. If I find @xyz in the Zotero library, but not in the bib file, that can equally likely be the scenario "added to the Zotero library, add to bib", or "removed from bib, remove from Zotero library".
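To make the ambiguity concrete, here is a small Python sketch (not BBT code) of what the extra history would buy. It assumes a hypothetical snapshot `base_keys` of the citekeys that existed on both sides at the last sync; the function name `classify` and the action labels are made up for illustration. Without that snapshot, the "only in Zotero" and "only in bib" cases cannot be told apart from deletions.

```python
def classify(base_keys, zotero_keys, bib_keys):
    """Suggest an action for every citekey that is not present on both sides.

    base_keys:   keys recorded at the last sync (the assumed "history")
    zotero_keys: keys currently in the Zotero collection
    bib_keys:    keys currently in the .bib file
    """
    actions = {}
    for key in sorted(set(base_keys) | set(zotero_keys) | set(bib_keys)):
        in_base = key in base_keys
        in_zot = key in zotero_keys
        in_bib = key in bib_keys
        if in_zot and in_bib:
            continue  # present on both sides, nothing to add or remove
        if in_zot:
            # Known at last sync -> it was deleted from the bib file;
            # unknown at last sync -> it is new in Zotero.
            actions[key] = "remove from Zotero" if in_base else "add to bib"
        elif in_bib:
            actions[key] = "remove from bib" if in_base else "add to Zotero"
        else:
            actions[key] = "forget"  # deleted on both sides since last sync
    return actions
```

This only covers added and removed keys; detecting edits to an entry that exists on both sides is a separate problem.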

Why not just have multiple \bibliography calls in your document source? Some for auto-exported bibs, others for manually maintained bibs. And why not work in one Zotero group library with your co-authors, and just use the exports rather than manually maintaining bibfiles?

retorquere commented 3 years ago

closing for lack of activity

tobiasBora commented 3 years ago

Sorry for the delayed answer. But yes, we definitely need an extra history to keep in memory. I understand that it can be too heavy to implement.

Having two .bib files only partly solves the problem: if a user decides to change something in the file managed by Zotero, the changes will be lost. And Zotero groups are a bit hard to set up, since they require action from all of my co-authors (my co-authors would also need to use Zotero, and we would need to find an easy way to sync it with Overleaf, given that they don't use git…).

I guess manual copy/paste is still the best solution if there is no plan to support an external history file. Thanks !

retorquere commented 3 years ago

> Sorry for the delayed answer. But yes, we definitely need an extra history to keep in memory. I understand that it can be too heavy to implement.

And this memory would have to be maintained by the users, manually.

> Having two .bib only partly solves the problem: if a user decides to change stuff in the file managed by Zotero, the changes will be lost.

Don't do that then? In the setup I have in mind, one file would be only updated by BBT, and another would only be updated by humans. If you think humans can be taught to update a manual change-log, surely they can be taught not to edit the auto-generated file. You could have the file name itself warn against this as an extra measure.

> And Zotero's group are a bit hard to setup since it requires action from all of my co-authors (my co-authors need to also use Zotero, we need to find an easy way to sync it with overleaf given that they don't use git…).

Overleaf has Dropbox support. And they don't have to use Zotero, really, they just must not touch the generated file. I haven't heard an argument against this setup yet.

Oh wait, you'd want to avoid adding an entry to Zotero that was already present in the communally maintained bib file? That would rely on getting the citekey exactly the same. This is going to be a quagmire. If you keep the communal bib file under version control, you could generate a diff into a bib file and import that periodically to keep in "sync". But if they don't use git, that's pretty much out then.
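The "diff into a bib file" idea could look roughly like the Python sketch below. It is only an illustration under stated assumptions: `entries_by_key` and `bib_delta` are hypothetical names, the two inputs are assumed to be two revisions of the communal file (for example the output of `git show HEAD~1:ref.bib` and the current ref.bib), and the parser assumes every entry starts with `@type{key,` at the beginning of a line.

```python
import re

# Simplifying assumption: each entry starts with "@type{key," on its own line.
ENTRY_START = re.compile(r"^@\w+\s*\{\s*([^,\s]+)\s*,", re.MULTILINE)


def entries_by_key(bib_text):
    """Map citekey -> raw entry text for one revision of a .bib file."""
    matches = list(ENTRY_START.finditer(bib_text))
    bounds = [m.start() for m in matches] + [len(bib_text)]
    return {
        m.group(1): bib_text[bounds[i]:bounds[i + 1]].strip()
        for i, m in enumerate(matches)
    }


def bib_delta(old_text, new_text):
    """Return (bib text of added or changed entries, list of removed keys)."""
    old = entries_by_key(old_text)
    new = entries_by_key(new_text)
    changed = [entry for key, entry in new.items() if old.get(key) != entry]
    removed = [key for key in old if key not in new]
    return "\n\n".join(changed), removed
```

The first return value is itself a small .bib file that could be imported into Zotero; the removed keys would still need to be handled by hand.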

> I guess manual copy/paste is still the best solution if there is no plan to support an external history file. Thanks !

The external history file wouldn't really lower the maintenance burden because you'd all now have to remember to note down 2021-03-08 16:02 added @jhacssde so BBT could compare. This would have to be done adhering to a fairly strict format too so it can be used for processing.

tobiasBora commented 3 years ago

> And this memory would have to be maintained by the users, manually.

Well, Zotero could do most of the hard work of maintaining the file, and only ask the user a few questions when required. Basically, in case of a conflict (such as an entry managed by Zotero having been changed), Zotero can simply ask the user "which entry do you want to keep?". The user's job is then basically to click a few yes/no buttons.
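A minimal Python sketch of that "ask once, then remember" behaviour, under stated assumptions: `decisions` is a hypothetical dict that would be loaded from and saved to a sidecar file next to the .bib (for example as JSON), `ask` is a hypothetical callback standing in for the dialog, and hashing the raw entry text is a simplification that treats any whitespace change as an edit. None of this is an existing Zotero or BBT API.

```python
import hashlib


def digest(entry_text):
    """Fingerprint one raw entry so later edits can be detected."""
    return hashlib.sha256(entry_text.encode("utf-8")).hexdigest()


def resolve(key, ref_entry, zot_entry, decisions, ask):
    """Return the version of a conflicting entry to keep.

    decisions: dict of remembered choices, updated in place
    ask:       callback (key, ref_entry, zot_entry) -> "ref" or "zot"
    """
    # A list (not a tuple) so the record survives a JSON round trip unchanged.
    fingerprint = [digest(ref_entry), digest(zot_entry)]
    remembered = decisions.get(key)
    if remembered and remembered["fingerprint"] == fingerprint:
        # Neither side changed since the user last answered: do not ask again.
        choice = remembered["choice"]
    else:
        choice = ask(key, ref_entry, zot_entry)
        decisions[key] = {"fingerprint": fingerprint, "choice": choice}
    return ref_entry if choice == "ref" else zot_entry
```

With this scheme the user is only prompted again when one of the two versions of an entry has actually changed since the last answer.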

> Overleaf has Dropbox support. And they don't have to use Zotero, really, they just must not touch the generated file. I haven't heard an argument against this setup yet.
>
> Oh wait, you'd want to prevent that you added an entry to Zotero that was already present in the communally maintained bib file?

Yes, the problem arises when a co-author wants to edit an entry from the Zotero file. If the co-author does not change the Zotero-managed file, then they must duplicate the entry and change all \cite{...} occurrences in the .tex file... Highly impractical.

> This would have to be done adhering to a fairly strict format too so it can be used for processing.

> The external history file wouldn't really lower the maintenance burden because you'd all now have to remember to note down ...

retorquere commented 3 years ago

> Well, Zotero could do most of the hard work to maintain the file, and only ask a few questions to the user when required. Basically, in case of a conflict (like an entry managed by Zotero is changed), Zotero can simply print a message to the user "which entry do you want to keep"? Therefore, the job of the user is basically to click on a few yes/no buttons.
>
> Yes, the problem arrives when a co-author wants to edit an entry from the Zotero file. If the author does not change the Zotero's file, then they must duplicate the entry and change all \cite{...} occurrences in the .tex file... Highly not practical.

So the request is really whether I want to add a second backend type to the Zotero sync mechanism, but one taking a bibtex file as the "database". It's not so much Zotero doing the hard work, it'd be me. That is really a lot more complicated than I want to take on, especially because it would involve a fair bit of UI work -- I minimize UI work, as I find it finicky and unrewarding. And this kind of sync must not just detect new/disappeared items -- it must also test for changes to items, and there's no simple way to check whether a Zotero item has changed vs a bibtex item, as the transformation is non-trivial, and not always loss-free. This all sounds incredibly fragile.

If there was already a solid history on the bib file, maybe something could be done, but if your co-authors also do not use git, that seems to be a dead end.