pdftotext windows continually opening in foreground

sbaross commented 6 years ago

I'm running the GUI on Windows 10, but I'm basically unable to use my computer while it's running because terminal windows from pdftotext are almost continually opening in the foreground. I'd estimate that 2 windows open and close every second, so anything else I'm trying to use flickers in and out of focus.

Most of the windows seem to be empty with no text, but occasionally some state "Syntax error: invalid font weight" (it's difficult to tell when they disappear so fast though!).

Is it possible to set these windows to open in the background or similar?
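
For context on the question above: on Windows, a program without its own console that launches a console tool such as pdftotext typically gets a fresh console window for each call, unless the child process is started with a "no window" flag. Below is a minimal sketch of that idea, assuming the tool is launched through Python's `subprocess` module on Python 3.5 or later. How Menotexport actually calls pdftotext is not shown in this thread, and `hidden_window_kwargs` and `run_pdftotext` are hypothetical helper names.

```python
import subprocess
import sys

# Value of the Win32 CREATE_NO_WINDOW process creation flag.
_CREATE_NO_WINDOW = 0x08000000


def hidden_window_kwargs(platform=sys.platform):
    """Return extra subprocess keyword arguments that suppress the console window.

    Hypothetical helper: on non-Windows platforms nothing needs to be added.
    """
    if not platform.startswith("win"):
        return {}
    # subprocess.CREATE_NO_WINDOW only exists on Windows (Python 3.7+),
    # so fall back to the raw flag value when the constant is missing.
    flag = getattr(subprocess, "CREATE_NO_WINDOW", _CREATE_NO_WINDOW)
    return {"creationflags": flag}


def run_pdftotext(pdf_path, exe="pdftotext"):
    """Run pdftotext on one PDF and return (text, error_output).

    Passing "-" as the output file asks pdftotext to write to stdout.
    Assumes the pdftotext executable is on PATH.
    """
    proc = subprocess.run(
        [exe, pdf_path, "-"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        **hidden_window_kwargs()
    )
    text = proc.stdout.decode("utf-8", errors="replace")
    errors = proc.stderr.decode("utf-8", errors="replace")
    return text, errors
```

Capturing stderr this way would also keep messages like "Syntax error: invalid font weight" readable instead of flashing past in a window.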

Xunius commented 6 years ago

Hi sbaross,

That sounds horrible. Could you let me know which version of Menotexport for Windows you used? If you let it keep flickering, does it terminate normally and give you the desired results? And if you are able to narrow it down a bit, for instance by letting it process only a folder that gives the problem and then some specific PDFs, that would be really helpful.

Sorry for the trouble.

sbaross commented 6 years ago

Hi, I used Menotexport v1.4 with poppler v0.67.0. I let it run over the weekend and it worked fine, seems like the program is all running as it should apart from these windows opening. The window appearing & disappearing rapidly happens with every PDF.

Now I've managed to run menotexport and import the .bib to Zotero, it doesn't seem to have extracted my notes as I'd want though. My notes from the "general notes" section in Mendeley have not been copied across, and every piece of text I've highlighted in the PDF has also been saved as individual notes which I do not want. I ran menotexport with every action ticked except "save separately". If I untick "extract highlights" will my highlights be preserved in the PDF but not saved as notes? And is it possible to also extract the general notes?

Finally, menotexport has merged the author keywords from Mendeley with my own tags. Is it possible to avoid this?

Xunius commented 6 years ago

Hi, I found a bug that could result in the "General notes" being missing and addressed that, so v1.5 contains that fix.

To group all highlighted texts into a single note, I think that's doable: one only needs to put them in a single {} in the bib entry (currently each highlight text has its own {}). I could make a separate version for you if you wish.
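
The difference between one {} per highlight and a single {} for all of them can be sketched as below. This is only an illustration of the brace grouping described above, not Menotexport's actual code: the function names are hypothetical, and how the braced groups are embedded in the bib entry is left out because the thread does not show it.

```python
def separate_highlights(highlights):
    """Current behaviour as described: each highlight gets its own {}."""
    return ["{" + h.strip() + "}" for h in highlights if h.strip()]


def merge_highlights(highlights, sep=" "):
    """Proposed behaviour: all highlights share a single {}.

    `sep` is an assumed separator between highlight texts.
    """
    cleaned = [h.strip() for h in highlights if h.strip()]
    return "{" + sep.join(cleaned) + "}"
```

With `["first ", " second"]` as input, the first function yields two braced groups and the second yields one.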

Yes, even if you untick "extract highlights" the highlights will still be preserved in the PDFs, only the texts won't be saved as notes (which also makes the process a lot faster).

Regarding the keywords being merged with tags, again I could compile a separate version for you if you don't want the tags. If that sounds OK I should be able to get back to you before Sat.

sbaross commented 6 years ago

Ah, I somehow missed v1.5!

I don't need the highlights to be extracted to notes so I'll just untick extract highlights next time. A version where I can ignore the keywords would be amazing though, there doesn't seem to be any way to easily delete all the author keywords in Mendeley either.

Xunius commented 6 years ago

Hi sbaross, yep, getting rid of the keywords is easy. Just to double check: you don't want highlighted texts to appear in the Zotero "Notes" tab, but you do want your "General notes", right? Currently, extracting "General notes" will also get sticky notes. Do you want them to each appear as a separate note entry in the Zotero tab, or all saved as a single note separated by empty lines?

UPDATE: actually, after a bit of experimenting I'm not quite sure I can make them separated by newlines in a single note, and that's why I initially put them all into separate entries. It seems that Zotero just removes all the newlines when reading the notes.

sbaross commented 6 years ago

Yeah, I want it so my Zotero notes don't contain the highlighted text from the PDF, but do contain my notes from Mendeley. I very rarely use the sticky notes in Mendeley to be honest, so as long as they're somewhere I'm not too bothered! Separate notes are fine; it was just tricky to find my actual notes when the section also had all my highlights. Seems like just unchecking "extract highlights" will fix that for me though.

Thanks so much for all your help!

Xunius commented 6 years ago

Alright, here is a version that removes the keywords: https://drive.google.com/open?id=1yO_SG7DcO674nyD0siKo1K5ZXISEFUn4. Can you give it a try and see if it works as expected?

sbaross commented 6 years ago

The new version doesn't seem to work. I've left it running in the background for a few hours now but it hasn't got past the first line after I press start:

================================1/16================================
# <Menotexport>: Processing folder: "RP1"

That folder only contains 13 articles so definitely shouldn't have taken that long.

Xunius commented 6 years ago

That's too bad, and I'm a bit clueless as to what's making it hang. Did you toggle the "export to ris" box? Can you try to use "export to bib" and not ris? Nope, it shouldn't be a ris issue.

Since you are not extracting highlight texts, the sqlite file should be the only input data. So if you feel comfortable sending me the sqlite database file, I could run the program for you and send you back the .bib file (with the keywords from Mendeley removed). You can then use this bib file, in combination with the exported PDF files you got using the old version, to go ahead with the Zotero import. (I assume you are doing a Zotero migration, so it would be a one-time task, not something you would need to do on a regular basis.)

You would also need to give me the path to the folder you chose to save the output using the old version (something like C:\Documents\Mendeley_export), because I need to replace the file paths in my .bib with that folder so that your Zotero can pick up the attached PDFs at the correct locations. Seems more complicated than I expected. What do you think?

Alternatively, you could maybe try some other tools. I found this one; it doesn't seem to have a GUI, though, so you'd have to do a bit of command-line work. You should also be able to find other tools that help with Zotero migration.

sbaross commented 6 years ago

I've just tried running v1.5 to see if I could at least run that and manually remove the author keywords, but it seems to hang the same way as the version you sent me. I've noticed that when I open these versions I get this pop-up, which I didn't with v1.4. I don't know if they're related or if it gives you a clue what the issue is?

I'll try running it on my personal computer this weekend and see if it's something strange with my uni computer. If not, if you're willing to run it for me that would be a great help! I can always move my PDFs to a different folder to mirror your .bib instead of the other way round if that saves you a job?

Edit: I seem to be having the same problem on my other laptop. Getting a slightly different message on this one. Again the same thing happens with v1.5 but no issue with v1.4. Running in administrator mode doesn't help.
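
For the manual fallback mentioned above (removing the keywords from an already exported .bib by hand), a rough sketch is below. It assumes each entry stores its keywords in a field literally named `keywords` that fits on one line, which is an assumption about the export format rather than something confirmed in this thread. `strip_keywords` is a hypothetical helper name.

```python
import re

# Matches a whole "keywords = {...}," line, with or without the trailing comma.
# Assumes the field sits on a single line of the .bib file.
KEYWORDS_LINE = re.compile(
    r"^[ \t]*keywords[ \t]*=[ \t]*\{.*\},?[ \t]*\r?\n",
    re.IGNORECASE | re.MULTILINE,
)


def strip_keywords(bib_text):
    """Return bib_text with every single-line keywords field removed."""
    return KEYWORDS_LINE.sub("", bib_text)
```

Note that this drops the whole field, so any personal tags that were merged into it would go too.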

Xunius commented 6 years ago

Honestly, I have no idea what's wrong. I packaged this Python program on Windows 7 running inside VirtualBox, so that it would be compatible with both Windows 7 and Windows 10. There was quite a gap between versions 1.4 and 1.5: I reinstalled the virtual machine, and the tool used for packaging may have changed version. But it runs fine on my machine, so it's beyond my knowledge. It wouldn't be much trouble for me to change the folder path to your preference; I could just do a global search/replace, and in fact you can do that too. If that sounds OK, you can send the sqlite file to my email (xugzhi1987@gmail.com).
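
The global search/replace on folder paths could look roughly like the sketch below. It assumes the .bib file stores attachment paths as plain text with a common folder prefix; if the export escapes backslashes or colons, the prefix strings would need to match that escaped form. `replace_path_prefix` and `rewrite_bib_file` are hypothetical helper names.

```python
import io


def replace_path_prefix(bib_text, old_prefix, new_prefix):
    """Replace every occurrence of old_prefix and report how many were changed."""
    count = bib_text.count(old_prefix)
    return bib_text.replace(old_prefix, new_prefix), count


def rewrite_bib_file(src_path, dst_path, old_prefix, new_prefix):
    """Read a .bib file, swap the folder prefix, and write the result to dst_path."""
    with io.open(src_path, encoding="utf-8") as handle:
        text = handle.read()
    new_text, count = replace_path_prefix(text, old_prefix, new_prefix)
    with io.open(dst_path, "w", encoding="utf-8") as handle:
        handle.write(new_text)
    return count
```

Writing to a separate output file keeps the original .bib untouched in case the prefix did not match as expected; a returned count of 0 is a hint that the paths are stored in a different form.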

sbaross commented 6 years ago

No worries, I'll email you the sqlite now. Thanks so much

Xunius / Menotexport

pdftotext windows continually opening in foreground #32