wshanks / lyz

LyZ is a plugin for Zotero, which is intended to make working with LyX/Zotero more pleasant.
GNU General Public License v3.0

Add function to check the integrity of lyz.sqlite #7

Open wshanks opened 10 years ago

wshanks commented 10 years ago

Issues #4 and #6 were both likely the result of empty entries in LyZ's database of bibliography data. It might be difficult to track down the root cause of this problem, but if it seems like the problem is prevalent enough it might be worth adding a function to LyZ to check for invalid database entries and to allow the user to fix them in some way without having to edit the SQLite file directly.
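
A minimal sketch of what such a check might look like, written in Python for illustration rather than in LyZ's own JavaScript. The schema of lyz.sqlite is not described in this thread, so the sketch assumes nothing about table or column names: it walks every user table and reports cells that are NULL or blank. The function name `find_empty_entries` is hypothetical, and the sketch assumes ordinary rowid tables.

```python
import sqlite3


def find_empty_entries(db_path):
    """Return (table, rowid, column) for every NULL or blank cell in the database."""
    problems = []
    conn = sqlite3.connect(db_path)
    try:
        # List user tables, skipping SQLite's internal bookkeeping tables.
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
            )
        ]
        for table in tables:
            # Quote the table name so unusual names cannot break the query.
            quoted = '"' + table.replace('"', '""') + '"'
            columns = [row[1] for row in conn.execute(f"PRAGMA table_info({quoted})")]
            for rowid, *values in conn.execute(f"SELECT rowid, * FROM {quoted}"):
                for column, value in zip(columns, values):
                    if value is None or (isinstance(value, str) and not value.strip()):
                        problems.append((table, rowid, column))
    finally:
        conn.close()
    return problems
```

Run against a copy of lyz.sqlite, this would only list candidate rows to inspect. Deciding how to repair them would still require knowing what the correct values should be.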

It will likely be a while before I have time to work on this so I am labeling it as "Help Welcome" in case anyone else is interested in trying to do this.

auxym commented 10 years ago

Hi, I'm currently experiencing this issue. When I click "Update BibTeX", I get the following error in the Firefox browser console.

Parameter 0 is undefined in Zotero.DB.getStatement() [QUERY: SELECT ROWID FROM items WHERE libraryID IS NULL AND key=?] db.js:324

uncaught exception: Parameter 0 is undefined in Zotero.DB.getStatement() [QUERY: SELECT ROWID FROM items WHERE libraryID IS NULL AND key=?]

I had a look in the lyz.sqlite file and saw no empty entries. Oops, I looked too quickly. See the mangled and empty entries in http://imgur.com/16a5mbs

My hypothesis is that LyZ read the first line of the .bib file and parsed it as keys. However, someone (me, oops) had hand-edited the file and accidentally replaced the first line with a @string definition. Hope that helps with tracking down the bug.

wshanks commented 10 years ago

Well, I'm glad this looks like the same issue instead of a new one.

Do you still have a backup of the .bib file that was causing the problem? It would be good to test your theory by editing the .bib file and seeing if LyZ then works as expected. If not, we can recreate what you did with a dummy file. What did you add to the .bib file?

You are right that LyZ starts off "Update BibTeX" by reading in the existing .bib file. I have to study the code more closely (I took over maintaining LyZ about two years ago but I'm not the original author), but it's not clear to me right now why it does this by default. The README states that the .bib file should not be edited directly, so I don't see why LyZ doesn't just rewrite the .bib file from scratch based on the SQLite file it maintains. I can see how some people would want to edit the .bib file by hand and not lose their changes, but I think that feature should probably be a separate command and not the default behavior, given that the README says not to edit the .bib file.

auxym commented 10 years ago

It seems that lyz writes the first line of the bibtex file as a sequence of whitespace-separated strings of the form [zotero library id]_[zotero item id]. I guess lyz depends on this first line to determine which references are used in the document, and then builds its DB from that.

Here is the procedure that I believe led to a broken lyz db.

  1. Create a lyz document and insert citations using lyz.
  2. Open the .bib file created by lyz and tamper with the first line that has the reference IDs. Either delete it or replace it, for example with a bibtex @comment line.
  3. Hit the bibtex update button in lyz. When you do this, it seems lyz attempts to parse the first line and inserts mangled references in its db based on it.

A solution might be to check the magic refID line with a simple regex before parsing it into the db.
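
Such a check could look like the following sketch, in Python for illustration rather than LyZ's JavaScript. The token format is an assumption based on the description above (a numeric library ID, an underscore, then an alphanumeric item ID); the pattern and the names `REF_ID` and `parse_ref_id_line` are hypothetical and would need adjusting to whatever LyZ actually writes.

```python
import re

# Assumed token shape: <library id>_<item id>, e.g. "0_ABCD2345".
REF_ID = re.compile(r"\d+_[A-Za-z0-9]+")


def parse_ref_id_line(first_line):
    """Split the first .bib line into reference IDs, or raise if it looks tampered with."""
    tokens = first_line.split()
    if not tokens:
        raise ValueError("first line of the .bib file contains no reference IDs")
    # Every token must match the expected shape in full, not just a prefix.
    bad = [t for t in tokens if not REF_ID.fullmatch(t)]
    if bad:
        raise ValueError(f"unexpected text in reference ID line: {bad[0]!r}")
    return tokens
```

An empty line or a line such as a bibtex @comment or @string entry raises `ValueError`, so the caller can stop before anything is written to the db.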

wshanks commented 10 years ago

Yes, I agree that better parsing and error handling is called for. LyZ should at least recognize that the .bib file has been modified in an unexpected way, stop trying to import the file into its database, and alert the user to an issue with the database.

I still want to think more about LyZ reading in the BibTeX file. I can see how it might be desirable, but it seems cleaner to allow just Zotero->BibTeX by default and not try to go both ways. For most users, it should be possible to make any necessary changes in Zotero and then write out the .bib file with the "Update BibTeX" function. Reading the .bib file back in seems like it could cause as many issues as it solves, so I would think it would be better as an optional feature turned off by default.

auxym commented 10 years ago

Yeah, I agree. As much as I think about it, I can't really see a good reason to keep track of cited references both in the lyz db and in the bibtex file.

Frodox commented 9 years ago

I can confirm that updating the bibtex database stops working if someone edits it outside of LyZ, for example by changing month = jul to month = {jul}, or by changing its encoding to cp-1251. It would be a really good idea to just rewrite the database during the update. Is that possible? But maybe LyZ does not know which references were already inserted, so it parses the existing database and modifies only the citations that need it?

wshanks commented 9 years ago

I think the default behavior should be for LyZ to regenerate the bib file every time the update BibTeX function is called, but it should also have the ability to update from the BibTeX file back to Zotero, and there should be a way to easily sync by updating one way and then the other. So far, I have not had time to read the LyZ code carefully to decide how to modify it to allow these different sync'ing options (when I have had time for this recently, I have used it to work on Zutilo). I do want to make these improvements, but I think the workaround of "don't edit the bib file by hand; edit the items in Zotero" is not that restrictive on usage.

Tobelix commented 8 years ago

Hi, I have the same problem (I cannot update the .bib file). I looked at the Firefox browser console and got this error:

Parameter 0 is undefined in Zotero.DB.getStatement() [QUERY: SELECT ROWID FROM items WHERE libraryID IS NULL AND key=?] db.js:324:0

uncaught exception: Parameter 0 is undefined in Zotero.DB.getStatement() [QUERY: SELECT ROWID FROM items WHERE libraryID IS NULL AND key=?]

I also checked my sqlite file but could not find any empty cells. Is there another possible cause for this problem?

wshanks commented 8 years ago

Have you modified the .bib file by hand? LyZ stores Zotero item id's in the first line of the .bib file. The "Update BibTeX" function reads in the first line of the .bib file, retrieves all of the Zotero items corresponding to the ids stored in this first line, and then exports item data for those items back out as a new .bib file. The error you quote indicates that LyZ has encountered an error looking up one of the Zotero item id's it read in. The way this has been happening for people is that they modify the .bib file by hand so that LyZ ends up reading in some other text instead of the ids formatted in the way that it put them out before. This can happen without any issues with the sqlite file, but in some cases it can also cause LyZ to do strange things to the sqlite file.

From what I have seen so far, it seems like these issues can be avoided as long as you do not modify the .bib file by hand, or at least do not change the beginning of the .bib file where LyZ stores the Zotero ids.

I certainly would not call LyZ's design robust with regard to this issue. I didn't write LyZ. I just submitted an update to the code a couple of years ago and the author asked me if I would take over maintenance of the project because he was not using LyZ any more. Ideally, I would modify LyZ to make it work more robustly, but I have not been able to devote the necessary time to do that. So far I have updated LyZ to keep it compatible with changes to Zotero, Firefox, and LyX and merged any patches submitted by others, but I have not made any changes to the way it works.

Tobelix commented 8 years ago

Hi, and thank you for the reply. I started the project a few months ago, so I cannot remember whether I modified the .bib file at the beginning. However, I would not rule it out. Do you know of a way to fix this problem?

Thank you very much

wshanks commented 8 years ago

If you are not sure if you modified the .bib file, then you probably haven't made many changes to it at least. Would it be possible to recreate the .bib file from scratch? Go through the steps again to set up LyZ for your document and then select all of the references you are using in Zotero and do "Cite in LyX" on them. LyZ creates the cite keys automatically, so it should use the same ones that were used before, so you shouldn't have to change anything in the LyX document. Doing "Cite in LyX" will insert one giant reference into LyX for all references. You can delete this reference.

The potentially tricky part is selecting all the references in Zotero. When I'm working on a document, I usually create a folder in Zotero for all of the document's references. Selecting all of the references is easy in that case. Also, you will lose any changes that you made by hand to the old .bib file, but usually it is better to make changes in Zotero and then regenerate the .bib file than to modify the .bib file directly, as we have discussed above.

Tobelix commented 8 years ago

Hmm, OK, I think this idea sounds good and the workload seems acceptable. I will try it in a few days and then report back here ;)

Thank you very much

Tobelix commented 8 years ago

So I just spent 2 hours following your advice and it works now. Thank you very much. I think I will not even open the .bib file again now :D Thank you again and kind regards

Tobi

wshanks commented 8 years ago

Ouch, sorry to hear that it took two hours to fix. If you keep all of the references you are using for something in a folder or give them all the same tag, it should be easy to recreate the bib file if you hit another problem like this.