Closed mbroedl closed 6 years ago
Just spoke to my friend who uses autocomplete-bibtex with Mendeley auto-export, and she experiences the same problem.
Thought about the problem more: probably since the file is likely to be written as an output stream, the file-change event listener fires many times (because the file changes continuously) instead of just once when the file is fully written (and is the case when saving it from a text editor).
Suggested solution: Maybe one could buffer the file-change events and when the last file-change event is maybe two seconds ago (could handle that via a promise design, maybe?), only then reload the bibtex file?
@mbroedl what OS are you on? This was built and test on mac where it seemed to work fine but I noticed issues on windows.
Myself I'm on Linux, my friend is on Windows.
@mbroedl I've updated the file loading the be asynchronous, which should solve this. Would be great if you could test the development version before I push this out to everyone! Note that if you have a really huge bibtex file, it will take some processing to parse it, but this should now at least not lock up the rest of the editor
Mhmmm... so when I make one change only, it's alright with this version. It still seems to block atom, but at least it goes back to normal (with my bibtex file) after a few seconds.
However, sometimes it doesn't and I tried to investigate why, so I added to provider.js -> update()
console.log(+ Date.now(), 'reading changed bibtex')
Outcome: In my case, it calls the function update twice on startup, and twice for every change made in zotero (which is probably an issue with the zotero exporter as well?). However, when I continue doing that, there is more and more re-reads of the bibtex file than changes I have made, and some still occur ages later.
So over the course of a minute or two of working in zotero (moving things, retagging, importing new files, etc), the update-function is called quite a lot of times (would assume like 100+ times) which even asynchronously blocks atom. Plus: this error came a few times in between, probably when the async reading is so delayed that the file is written again?
lite-bibtex-parse.js [sm]:298 Uncaught (in promise) TypeError: Unterminated value: value_braces at BibtexParser.value_braces (lite-bibtex-parse.js [sm]:278) at BibtexParser.single_value (lite-bibtex-parse.js [sm]:315) at BibtexParser.value (lite-bibtex-parse.js [sm]:331) at lite-bibtex-parse.js [sm]:371 at BibtexParser.key_equals_value (lite-bibtex-parse.js [sm]:385) at BibtexParser.key_value_list (lite-bibtex-parse.js [sm]:400) at BibtexParser.entry_body (lite-bibtex-parse.js [sm]:408) at BibtexParser.entry (lite-bibtex-parse.js [sm]:429) at BibtexParser.parse (lite-bibtex-parse.js [sm]:470) at Object.toJSON (lite-bibtex-parse.js [sm]:478) at BibtexReader.
(bibtex-reader.js [sm]:5) at Generator.next ( ) at step (bibtex-reader.js [sm]:1) value_braces @ lite-bibtex-parse.js [sm]:298 single_value @ lite-bibtex-parse.js [sm]:334 value @ lite-bibtex-parse.js [sm]:348 (anonymous) @ lite-bibtex-parse.js [sm]:387 key_equals_value @ lite-bibtex-parse.js [sm]:404 key_value_list @ lite-bibtex-parse.js [sm]:421 entry_body @ lite-bibtex-parse.js [sm]:435 entry @ lite-bibtex-parse.js [sm]:465 parse @ lite-bibtex-parse.js [sm]:511 toJSON @ lite-bibtex-parse.js [sm]:522 (anonymous) @ bibtex-reader.js [sm]:15 step @ bibtex-reader.js [sm]:1
Is there a way to trigger reloads only when the window is in focus? Because I won't need an updated file in atom, when still working in zotero, for example. And if something in zotero happens asynchronously (e.g. adding a paper) while the focus is on atom, it should not involve too many changes to the database and thus updates.
Sorry for this... :/
I will leave a bug report with the zotero extension to build in a buffer, as well!
The autoexport extension for zotero has a setting to only export when idle. That should fix my problem for now!
@mbroedl thanks so much for digging into this! For sure the error you are seeing is due to the system trying to parse a partially written file. I think you are right that it's caused by reading being delayed so much that it starts when the file is getting written again. What is ideally needed is a way to interrupt parsing if a new change occurs and start again, I hope to find a way for this. On the other hand I'm not sure if there is a way to delay until the window regains focus. Glad you have a workaround for now.
I've just experimented a bit. Since the events seemed to be bundled in the course of 1 or 2 seconds, you could run timeouts to reread the reference files (or maybe even to call update).
One could do the following to re-read 1000 ms after the last time update() was called:
clearTimeout(timeout);
timeout = setTimeout(() => {
// update, readReference, or whatever
}, 1000);
Otherwise, maybe in lite-bibtex-parse.js one could add a cancellation variable, so that
while (this.matchAt()) {
in the parse() function becomes
while (this.matchAt() && timestampWhenParserCalled == lastCalledTimestamp) {
where lastCalledTimestamp could be a global(?) variable?
Still, it might be good to connect this to the first option (maybe with 200ms or so), to avoid double-reading the file (which is done by a file-handler so probably a bit more difficult to stop).
Otherwise, re-reading on focus would be something like (just making this up now, none of the functions is correct):
event.onFileChange => {
if window.hasFocus() { update(); }
else {
if noEventOnRefocus {
register EventOnRefocus => {
destroy EventOnRefocus;
update();
}
}
}
}
I would really like to be able to interrupt the parsing, and probably a global variable is the way to go, although I'm also considering how i could offload the work to a child process or Web Worker (which would be neat in that it would use parallel processing) - especially since I think the parsing is what is taking time and locking up the system, even a 'big' bibtex file is only a few MB which should be very fast to read.
@mbroedl I have tried offloading the bibtex processing to a web-worked, would be great if you could test it! It should now not lock up the editor when processing. However, i didn't yet implement interrupting the updates when new changes happen because i need to decide what the best way to structure everything is.
Good news: it did not block atom (although it was slow). Atom's electron process took up more than one processor now and all seemed super busy. (EDIT: it was slow when doing a lot of stuff in zotero and then changing to atom; when doing one change, I couldn't find a performance change, and it did not block at all.)
But it threw a lot of error messages when just performing a lot of changes:
many times, once already on startup:
/usr/lib/atom/src/state-store.js:49 Uncaught (in promise) DOMException: Failed to execute 'put' on 'IDBObjectStore': #<Promise> could not be cloned.
at dbPromise.then (/usr/lib/atom/src/state-store.js:49:12)
at <anonymous>
many times:
bibtex-reader.js? [sm]:37 Uncaught TypeError: Cannot read property 'terminate' of null
at Worker._this.bibReaderWorker.onmessage (bibtex-reader.js? [sm]:23)
_this.bibReaderWorker.onmessage @ bibtex-reader.js? [sm]:37
once, probably because too many updates on the file happened:
(node:17586) MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 21 ipc-helpers-window-method-response listeners added. Use emitter.setMaxListeners() to increase limit
process.on @ internal/process/warning.js:21
emitOne @ events.js:96
emit @ events.js:191
process.nextTick @ internal/process/warning.js:47
_combinedTickCallback @ internal/process/next_tick.js:73
_tickCallback @ internal/process/next_tick.js:104
/usr/lib/atom/src/state-store.js:49 Uncaught (in promise) DOMException: Failed to execute 'put' on 'IDBObjectStore': #<Promise> could not be cloned.
at dbPromise.then (/usr/lib/atom/src/state-store.js:49:12)
at <anonymous>
dbPromise.then @ /usr/lib/atom/src/state-store.js:49
Without doing any changes, error one comes maybe every 2 minutes. When only doing one change, so possibly 2 calls to update as before? There were maybe 5-10 more of these. The update of the database worked nonetheless.
Atom 1.23.3 Electron 1.7.11 Node 9.4.0
@mbroedl I've made some more improvements, most importantly to limit the number of times the update function is called when the file changes to no more than once every 2 seconds or so, and to check that the file has finished writing before reading. Let me know if this improves the situation for you.
Works wonderfully. Processor load never went up noticeably, and electron did not block!
The only peculiarity (but I don't think it's a reasonable problem) is when the dialog for autocompletion is open, and atom has reloaded the file, the dialog is not reloaded (i.e. the old file is used). The new/changed entries only show up when typing/deleting one character (as in: autocomplete is called again). But that's an unlikely case anyway.
Thanks a lot!
One thing: Atom notified me that I needed a module fuse.js when I first started it with that module. But I guess that would be taken care of by apm in the production version?
Great! I think the issue you mentioned are ok. fuse.js should be installed in the production version automatically, will close the issue then.
I use a large (1.6MB) file exported from zotero with autocomplete-bibtex. Runs great, especially since the performance improvements. :)
At the moment, I use the zotero-better-bibtex extension (since I use autocomplete-bibtex luckily not anymore to manage citation keys) to auto-export the file. It seems like it overwrites the .bib file in place. While it's great that autocomplete-bibtex notices change, electron takes up a whole processor's capacity whenever the file changes, and does not stop anymore even after minutes, whereas a startup with a new file is not a problem at all.
Further details:
Concluding, I assume the problem is that the file is written as an output stream. I could post a bug report on better-bibtex as well, ask them to not write as a stream in-place.
Possible solutions:
Probably there is more, not sure.