Xunius / Menotexport

Python solution to export annotations from your Mendeley library.
GNU General Public License v3.0

No handlers could be found for logger "lib.pylatexenc.latexencode" #5

Open ilc168 opened 8 years ago

ilc168 commented 8 years ago

Hello, I'm having some issues running your script on Ubuntu. About half of my folders and files get converted, and then the export stops incomplete. The error message is below. Is any clean-up of the Mendeley data required to help this run (e.g. removing special characters from author names or titles, or shortening titles)? My next step is to go folder by folder through my library to see if I can isolate the issue.

I also noted that the "date added" field does not update on import into Zotero, so I'd like to request that this be included in a future version. Thanks for the work you have put into this! Regards, Ian


# <Menotexport>: Exporting meta-data and annotations to .bib
file...

No handlers could be found for logger "lib.pylatexenc.latexencode"

------------------------------------------------------------------
# <Menotexport>: Exporting meta-data and annotations to .ris
file...

Traceback (most recent call last):
  File "menotexport.py", line 1144, in <module>
    args.separate,args.zotero,args.verbose)
  File "menotexport.py", line 1007, in main
    fidii,fnameii,allfolders,action,separate,iszotero,verbose)
  File "menotexport.py", line 966, in processFolder
    risfolder,allfolders,isfile,iszotero,verbose)
  File "/home/arjuna/Downloads/Menotexport-master/lib/export2ris.py", line 284, in exportDoc2Ris
    risdata=parseMeta(docii,basedir,isfile,iszotero)
  File "/home/arjuna/Downloads/Menotexport-master/lib/export2ris.py", line 157, in parseMeta
    entries.append('EP - %s' %str(pmatch.group(2)))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xad' in position 0: ordinal not in range(128)

Xunius commented 8 years ago

Hi Ian, Thanks for your feedback. I don't want users to have to do any clean-up to get it running properly, though the problem does seem to be related to some special characters in one or a few documents. I will try to debug that and get it patched asap. In the meantime, if you intend to migrate to Zotero, you could export only to .bib format and not .ris, and import into Zotero using the .bib output, because the error occurred while formatting the .ris entries. Do you also get incomplete output when exporting only to .bib and skipping .ris?

It would help me fix this faster if you could provide some more information about the error. The traceback at the end of your post seems to come from an unusual "pages" field in one of your documents: the failing line is entries.append('EP - %s' %str(pmatch.group(2))), where EP means end page. So if you could locate the problematic document and let me know what its "pages" field looks like, that would be very helpful. My guess is that it is missing an end page, i.e. something like "456 - ". At the end of its messages the programme lists all failed documents, so it should be in that list.
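For reference, u'\xad' in the traceback is a soft hyphen, and in Python 2 calling str() on a unicode string that contains one triggers an implicit ASCII encode, which is what raises the UnicodeEncodeError. Below is a minimal sketch of a more tolerant page parser. The function name ris_page_entries and the separator pattern are hypothetical and are not taken from export2ris.py; the 'SP - ' / 'EP - ' tags simply follow the format shown in the traceback.

```python
import re

SOFT_HYPHEN = u'\xad'  # the character named in the traceback

# Assumed separator: one or more hyphens, en dashes or em dashes,
# with optional surrounding whitespace.
PAGE_SEP = re.compile(u'\\s*[-\u2013\u2014]+\\s*')


def ris_page_entries(pages):
    """Return 'SP'/'EP' lines for a pages string (hypothetical helper).

    Soft hyphens are dropped, a missing end page is tolerated, and the
    formatting stays in unicode so no implicit ASCII encode takes place.
    """
    cleaned = pages.replace(SOFT_HYPHEN, u'').strip()
    parts = PAGE_SEP.split(cleaned, maxsplit=1)
    entries = []
    if parts[0]:
        entries.append(u'SP - %s' % parts[0])
    if len(parts) > 1 and parts[1]:
        entries.append(u'EP - %s' % parts[1])
    return entries


if __name__ == '__main__':
    print(ris_page_entries(u'123-\xad456'))
    print(ris_page_entries(u'456 - '))
```

The design choice here is to clean the field rather than reject the document, so a stray invisible character in one "pages" field does not abort the whole export.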

As for the "No handlers could be found for logger "lib.pylatexenc.latexencode"" message, I couldn't really debug that because it gives no details. I've added a small fix to make it output more error messages, so if you could download the programme again from GitHub and re-run it, it should give more details about that error.
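Some background on that message: Python 2's logging module prints it once when a logger emits a record and no handler has been configured anywhere, so the actual warning text is lost. A minimal sketch of one way to surface it is to attach a handler to that logger before running the export. This is an assumption about what such a change could look like, not necessarily the fix that was committed; the helper name enable_latexencode_logging is hypothetical.

```python
import logging
import sys


def enable_latexencode_logging(stream=None):
    """Attach a stream handler to the logger named in the message.

    Hypothetical helper: with a handler in place, the record that
    previously produced "No handlers could be found" is printed in full.
    """
    logger = logging.getLogger("lib.pylatexenc.latexencode")
    handler = logging.StreamHandler(stream if stream is not None else sys.stderr)
    handler.setFormatter(
        logging.Formatter("%(name)s: %(levelname)s: %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(logging.WARNING)
    return logger


if __name__ == "__main__":
    log = enable_latexencode_logging()
    log.warning("example record, now visible")
```

Calling logging.basicConfig() once at program start would have a similar effect for all loggers at once.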

Finally the "added date" issue, I'll look into that. Thanks again for your feedback.

ilc168 commented 8 years ago

Thanks for getting back so quickly. Following the errors I did some file clean-ups just in case (removed "-", ":" etc. from headings and author names, and rearranged some folder structures to see if that was the cause). I'm still getting some errors and unfortunately can't reproduce the error message above any more!

Yes I also get an error whether going to RIS or BIB. Here is the most recent one. I started using the GUI yesterday to isolate the issue and it seemed to be working well, but I'm starting again today with the GUI so will update once I have gone through that again.

Meanwhile, here is the error from running the script. Could you help me interpret it: is it tripping at article 1/9 or 5/9? I can't pick up anything abnormal about the articles, though the Redman article (1/9) is the only one in the folder that has a "contents".

: Connected to database:

 /home/arjuna/.local/share/data/Mendeley Ltd./Mendeley
Desktop/****@gmail.com@www.mendeley.com.sqlite

================================ 1/11 ================================

: Processing folder: "Jai"

------------------------------------------------------------------
# <Menotexport>: Exporting annotated PDFs ...

    ............................. 1/9 .............................
    # <Menotexport>: Exporting PDF:

         Redman - 2014 - Should sustainability and resilience be
        combined or remain distinct pursuits.pdf

    ............................. 2/9 .............................
    # <Menotexport>: Exporting PDF:

         Mathie, Cameron, Gibson - Unknown - Asset-Based and
        Citizen-Led Development Changing The Development
        Conversation V2.pdf

    ............................. 3/9 .............................
    # <Menotexport>: Exporting PDF:

         Cheshire, Woods - 2009 - Citizenship and Governmentality,
        Rural.pdf

    ............................. 4/9 .............................
    # <Menotexport>: Exporting PDF:

         Cote, Nightingale - 2012 - Resilience thinking meets
        social theory Situating social change in socio-ecological
        systems (SES) research.pdf

    ............................. 5/9 .............................
    # <Menotexport>: Exporting PDF:

         McGregor - 2009 - New possibilities Shifts in post-
        development theory and practice.pdf

PdfReadWarning: Xref table not zero-indexed. ID numbers for objects will not be corrected. [pdf.py:1503]

    ............................. 6/9 .............................
    # <Menotexport>: Exporting PDF:

         Mathie, Cameron, Gibson - 2014 - Asset-Based and Citizen-
        Led Development Changing The Development
        Conversation(2).pdf

    ............................. 7/9 .............................
    # <Menotexport>: Exporting PDF:

         Ireland, McKinnon - 2013 - Strategic localism for an
        uncertain world A postdevelopment approach to climate
        change adaptation(2).pdf

    ............................. 8/9 .............................
    # <Menotexport>: Exporting PDF:

         Cavalcanti - 2007 - Development versus enjoyment of life
        a post-development critique of the developmentalist
        worldview(2).pdf

    ............................. 9/9 .............................
    # <Menotexport>: Exporting PDF:

         Berkes, Ross - 2013 - Community Resilience Toward an
        Integrated Approach.pdf

------------------------------------------------------------------
# <Menotexport>: Exporting un-annotated PDFs ...

    ............................. 1/2 .............................
    # <Menotexport>: Copying file:

         MacLeod, Emejulu - 2014 - Neoliberalism With a Community
        Face A Critical Analysis of Asset-Based Community
        Development in Scotland.pdf

    ............................. 2/2 .............................
    # <Menotexport>: Copying file:

         Walker, Cooper - 2011 - Genealogies of resilience From
        systems ecology to the political economy of crisis
        adaptation.pdf

------------------------------------------------------------------
# <Menotexport>: Extracting annotations from PDFs ...

    ............................. 1/9 .............................
    # <Menotexport>: Processing file:

         Redman - 2014 - Should sustainability and resilience be
        combined or remain distinct pursuits.pdf

Traceback (most recent call last):
  File "menotexport.py", line 1153, in <module>
    args.separate,args.zotero,args.verbose)
  File "menotexport.py", line 1009, in main
    fidii,fnameii,allfolders,action,separate,iszotero,verbose)
  File "menotexport.py", line 907, in processFolder
    annotations,flist=extractAnnos(annotations,action,verbose)
  File "menotexport.py", line 797, in extractAnnos
    from lib import extracthl2
  File "/home/arjuna/Downloads/Menotexport-master/lib/extracthl2.py", line 23, in <module>
    from pdfminer.pdfdocument import PDFDocument
ImportError: No module named pdfdocument

Xunius commented 8 years ago

Hi Ian, Sorry for the trouble this has been causing. First, the new error, "ImportError: No module named pdfdocument", is very likely caused by an older version of the "pdfminer" module. I got one such report earlier, and the user solved it by manually installing a newer version (v2014+ is required, but the one in the Ubuntu repository was older than that and probably still is). Fortunately the installation is not complicated; here is the link: https://euske.github.io/pdfminer/index.html. Once this is fixed, the export should be able to proceed.
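A quick way to check which situation you are in before re-running the export is sketched below. The only thing it relies on from the traceback is the module name pdfminer.pdfdocument; the helper names has_module and pdfminer_status are hypothetical and are not part of Menotexport.

```python
import importlib


def has_module(name):
    """Return True if the module `name` can be imported, else False."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


def pdfminer_status():
    """Describe the installed pdfminer (hypothetical helper)."""
    if has_module("pdfminer.pdfdocument"):
        return "pdfminer provides pdfminer.pdfdocument"
    if has_module("pdfminer"):
        return "pdfminer is installed but too old (no pdfminer.pdfdocument)"
    return "pdfminer is not installed"


if __name__ == "__main__":
    print(pdfminer_status())
```

If it reports the "too old" case, installing the newer version from the link above should resolve the ImportError.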

The GUI and command line versions are fundamentally the same: if the command line fails, the GUI will too. You can use whichever you like.

I'll try to quickly implement a "needs review" list alongside the lists of failed documents, so that users are warned where the program may have failed to get the metadata right. But I don't want to delay your work. If you install a newer pdfminer module and let it process your Mendeley library, even an incomplete export still saves time. For the individual problematic docs, I'd suggest doing some manual work, just to get the migration done so you can proceed with your own work. I also suggest leaving this issue open while I do some further testing and fixing.

I haven't had time to look into the "added time" issue yet, so if you do the Zotero import now, I'm afraid it won't be perfect.