Open ilc168 opened 8 years ago
Hi Ian, Thanks for your feedback. Users shouldn't need to do any clean-up to get it running properly, though the problem does seem to be related to special characters in one or a few documents. I will try to debug that and get it patched asap. In the meantime, if you intend to migrate to Zotero, you could export only to .bib format, not .ris, and import into Zotero from the .bib output, because the error occurred while formatting the .ris entries. Do you also get incomplete output when exporting only to .bib and skipping .ris?
It would help me fix this faster if you could give me some more information about the error. The error at the end of your post seems to come from a strange "pages" field in one of your documents (entries.append('EP - %s' %str(pmatch.group(2))); here EP means end page). If you could locate that problematic document and let me know what its "pages" field looks like, that would be very helpful. My guess is that it is missing an end page, i.e. something like "456 - ". At the end of its messages the programme outputs all failed documents, so it should be in that list.
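To narrow down which document has the odd "pages" value, a quick check along these lines might help. This is only a sketch: the `PAGES_RE` pattern and the `find_suspect_pages` helper are hypothetical and are not the code in export2ris.py, and the sample documents are made up. It flags any value that is not a plain "start - end" range or a single page number, so some legitimate values (for example "e1234") would be flagged for manual review too.

```python
import re

# Hypothetical pattern: a single page, or "start - end" with an ASCII hyphen.
PAGES_RE = re.compile(r'^\s*\d+(\s*-\s*\d+)?\s*$')

def find_suspect_pages(docs):
    """Return (title, pages) pairs whose pages value looks unusual."""
    suspects = []
    for title, pages in docs:
        # Documents without a pages value have nothing to format, so skip them.
        if pages and not PAGES_RE.match(pages):
            suspects.append((title, pages))
    return suspects

# Made-up sample data, for illustration only.
docs = [
    ('Doc A', '123 - 145'),
    ('Doc B', '456 - '),       # missing end page
    ('Doc C', u'12\xad34'),    # soft hyphen (U+00AD) instead of '-'
    ('Doc D', None),           # no pages field at all
]

for title, pages in find_suspect_pages(docs):
    print('%s: %r' % (title, pages))
```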
As for the "No handlers could be found for logger "lib.pylatexenc.latexencode"" error, I couldn't really debug it because it gives no details. I've added a small fix so that it outputs more error messages; if you download the programme again from GitHub and re-run it, it should give more details about that error.
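For context, Python 2 prints that message from the standard logging module when a logger emits a record and no handler has been configured, so the actual warning text is lost. Below is a minimal sketch of making such messages visible by configuring the root logger at start-up. The `enable_logging` name is illustrative and this is not necessarily the fix mentioned above.

```python
import logging

def enable_logging(level=logging.DEBUG):
    """Attach a default stderr handler to the root logger.

    basicConfig() does nothing if the root logger already has a handler.
    """
    logging.basicConfig(level=level,
                        format='%(levelname)s %(name)s: %(message)s')
    return logging.getLogger()

root = enable_logging()

# Named loggers propagate to the root logger by default, so this record
# now reaches the root handler instead of being dropped.
logging.getLogger('lib.pylatexenc.latexencode').warning('example warning')
```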
Finally the "added date" issue, I'll look into that. Thanks again for your feedback.
Thanks for getting back so quickly. Following the errors I did some file clean-up just in case (removed "-", ":" etc. from headings and author names, and rearranged some file structures) to see if that was the cause. I'm still getting some errors and unfortunately now can't reproduce the error message above!
Yes I also get an error whether going to RIS or BIB. Here is the most recent one. I started using the GUI yesterday to isolate the issue and it seemed to be working well, but I'm starting again today with the GUI so will update once I have gone through that again.
Meanwhile, here is the error from running the script. Could you clarify how to interpret it: is it tripping at article 1/9 or 5/9? There is nothing seemingly abnormal about the articles that I can pick up, though the Redman article (1/9) is the only one in the folder that has a "contents".
/home/arjuna/.local/share/data/Mendeley Ltd./Mendeley
Desktop/****@gmail.com@www.mendeley.com.sqlite
================================ 1/11 ================================
------------------------------------------------------------------
# <Menotexport>: Exporting annotated PDFs ...
............................. 1/9 .............................
# <Menotexport>: Exporting PDF:
Redman - 2014 - Should sustainability and resilience be
combined or remain distinct pursuits.pdf
............................. 2/9 .............................
# <Menotexport>: Exporting PDF:
Mathie, Cameron, Gibson - Unknown - Asset-Based and
Citizen-Led Development Changing The Development
Conversation V2.pdf
............................. 3/9 .............................
# <Menotexport>: Exporting PDF:
Cheshire, Woods - 2009 - Citizenship and Governmentality,
Rural.pdf
............................. 4/9 .............................
# <Menotexport>: Exporting PDF:
Cote, Nightingale - 2012 - Resilience thinking meets
social theory Situating social change in socio-ecological
systems (SES) research.pdf
............................. 5/9 .............................
# <Menotexport>: Exporting PDF:
McGregor - 2009 - New possibilities Shifts in post-
development theory and practice.pdf
PdfReadWarning: Xref table not zero-indexed. ID numbers for objects will not be corrected. [pdf.py:1503]
............................. 6/9 .............................
# <Menotexport>: Exporting PDF:
Mathie, Cameron, Gibson - 2014 - Asset-Based and Citizen-
Led Development Changing The Development
Conversation(2).pdf
............................. 7/9 .............................
# <Menotexport>: Exporting PDF:
Ireland, McKinnon - 2013 - Strategic localism for an
uncertain world A postdevelopment approach to climate
change adaptation(2).pdf
............................. 8/9 .............................
# <Menotexport>: Exporting PDF:
Cavalcanti - 2007 - Development versus enjoyment of life
a post-development critique of the developmentalist
worldview(2).pdf
............................. 9/9 .............................
# <Menotexport>: Exporting PDF:
Berkes, Ross - 2013 - Community Resilience Toward an
Integrated Approach.pdf
------------------------------------------------------------------
# <Menotexport>: Exporting un-annotated PDFs ...
............................. 1/2 .............................
# <Menotexport>: Copying file:
MacLeod, Emejulu - 2014 - Neoliberalism With a Community
Face A Critical Analysis of Asset-Based Community
Development in Scotland.pdf
............................. 2/2 .............................
# <Menotexport>: Copying file:
Walker, Cooper - 2011 - Genealogies of resilience From
systems ecology to the political economy of crisis
adaptation.pdf
------------------------------------------------------------------
# <Menotexport>: Extracting annotations from PDFs ...
............................. 1/9 .............................
# <Menotexport>: Processing file:
Redman - 2014 - Should sustainability and resilience be
combined or remain distinct pursuits.pdf
Traceback (most recent call last):
File "menotexport.py", line 1153, in
Hi Ian, Sorry for the trouble it has been causing. First, the new error "ImportError: No module named pdfdocument" is very likely caused by an older version of the "pdfminer" module. I got one such report earlier, and the user solved it by manually installing a newer version (v2014+ is required, but the one in the Ubuntu repository was older than that and probably still is). Fortunately the installation is not complicated; here is the link: https://euske.github.io/pdfminer/index.html. Once this is fixed, it should be able to proceed.
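A quick way to check whether the installed pdfminer is new enough is to test whether the missing module can be imported. This sketch assumes the module named in the ImportError is `pdfminer.pdfdocument`; the `has_module` helper is only for illustration.

```python
import importlib

def has_module(name):
    """Return True if the named module can be imported."""
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False

if has_module('pdfminer.pdfdocument'):
    print('pdfminer.pdfdocument is available')
else:
    print('pdfminer is missing or too old; install a newer version')
```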
The GUI and command line versions are fundamentally the same: if the command line fails, the GUI will too. You can use whichever you like.
I'll try to quickly implement a "needs review" list alongside the lists of failures, so that the user is warned where the program may fail to get the metadata right. But I don't want to delay your work. If you install a newer pdfminer module and let it process your Mendeley library, even an incomplete export still saves time. For the individual problematic docs, I'd suggest doing some manual work, just to get the migration done so you can proceed with your own work. I also suggest leaving this issue open while I do some further testing and fixing.
I haven't had time to look into the "added time" issue yet, so if you do the Zotero import now, I'm afraid the import won't be perfect.
Hello, I'm having some issues running your script on Ubuntu. About half of my folders and files get converted, and then the export ends incomplete. The error message is below. Is any clean-up of the Mendeley data required to help this run (e.g. removing special characters from author names or titles, or shortening titles)? My next step is to go folder by folder through my library to see if I can isolate the issue.
I also noted that the "date added" field is not carried over on import into Zotero, so a request for future versions: could this be included? Thanks for the work you have put into this! Regards, Ian
No handlers could be found for logger "lib.pylatexenc.latexencode"
Traceback (most recent call last):
File "menotexport.py", line 1144, in
args.separate,args.zotero,args.verbose)
File "menotexport.py", line 1007, in main
fidii,fnameii,allfolders,action,separate,iszotero,verbose)
File "menotexport.py", line 966, in processFolder
risfolder,allfolders,isfile,iszotero,verbose)
File "/home/arjuna/Downloads/Menotexport-master/lib/export2ris.py", line 284, in exportDoc2Ris
risdata=parseMeta(docii,basedir,isfile,iszotero)
File "/home/arjuna/Downloads/Menotexport-master/lib/export2ris.py", line 157, in parseMeta
entries.append('EP - %s' %str(pmatch.group(2)))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xad' in position 0: ordinal not in range(128)
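Reading the last frame of this traceback: in Python 2, `str()` encodes a unicode value with the ASCII codec, and u'\xad' is a soft hyphen (U+00AD), so the matched end-page value apparently begins with a soft hyphen. Below is a hedged sketch of one way to avoid the exception. `format_end_page` is a hypothetical helper, not the fix in the repository, and dropping soft hyphens is an assumption about what the field should contain.

```python
SOFT_HYPHEN = u'\xad'  # U+00AD, usually invisible when displayed

def format_end_page(value):
    """Build the end-page entry without calling str() on unicode text."""
    # Assumption: a soft hyphen carries no meaning in a page number.
    cleaned = value.replace(SOFT_HYPHEN, u'').strip()
    # A unicode format string keeps the result unicode in Python 2,
    # so no implicit ASCII encoding takes place.
    return u'EP - %s' % cleaned

entry = format_end_page(u'\xad145')
print(repr(entry))
```

The resulting entry still has to be encoded explicitly (for example as UTF-8) when it is written to the .ris file.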