rmzelle / ref-extractor

Reference Extractor - Extract Zotero/Mendeley references from Microsoft Word files
https://rintze.zelle.me/ref-extractor/
MIT License

Reference Extractor not extracting all references #31

Closed jmgreyber closed 5 years ago

jmgreyber commented 5 years ago

Thanks for this amazing tool! I'm having an issue, though, with it not pulling all references. Can you help?

I am trying to pull from a .docx file that someone else created and sent to me. Document preferences seem to be set correctly, but the reference extractor is only pulling 15 of 20 unique records.

Any idea what I am doing wrong? TIA!

[Screenshots: ref extractor snip 1, ref extractor snip 2]

rmzelle commented 5 years ago

@jmgreyber, thanks for reaching out. If you can share the document (privately), or just a section with an in-text citation that doesn't extract, I can take a look. Can you confirm that the problematic references turn grey when you select them (which means they are still active fields) and are not regular text?

If you toggle field codes with Alt+F9 you might also be able to see whether there is anything obviously different about the problematic reference fields.
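For a quicker check than toggling field codes by hand, the field instructions can also be listed with a short script. This is only a rough sketch, not how ref-extractor itself works: `field_instructions` is a hypothetical helper, it assumes the citations are stored as Word complex fields (`w:fldChar` / `w:instrText`), and it ignores nested fields. As far as I know, Zotero citation fields start with `ADDIN ZOTERO_ITEM CSL_CITATION`, so a citation that shows up in the text but not in this list is probably plain text.

```python
import xml.etree.ElementTree as ET
from typing import List, Union

# WordprocessingML namespace used in word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def field_instructions(document_xml: Union[str, bytes]) -> List[str]:
    """Return the instruction text of each complex field, in document order.

    Simplification: nested fields are not handled.
    """
    root = ET.fromstring(document_xml)
    instructions = []
    current = None  # pieces of the field instruction currently being read
    for el in root.iter():
        if el.tag == W + "fldChar":
            kind = el.get(W + "fldCharType")
            if kind == "begin":
                current = []
            elif kind in ("separate", "end") and current is not None:
                # The instruction ends where the visible field result starts.
                instructions.append("".join(current).strip())
                current = None
        elif el.tag == W + "instrText" and current is not None:
            # Word may split one instruction across several runs.
            current.append(el.text or "")
    return instructions


# Usage with a real file (the path is a placeholder):
# import zipfile
# with zipfile.ZipFile("manuscript.docx") as docx:
#     for code in field_instructions(docx.read("word/document.xml")):
#         print(code[:60])
```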

rmzelle commented 5 years ago

And for public sharing (of an excerpt), you can just attach a Word document to a GitHub comment here. For private sharing, just send me a note via https://citationstyles.org/contact/#/contact-form, and I'll give you my email address in my reply so you can send me the Word document as an email attachment.

rmzelle commented 5 years ago

Closing because of inactivity. Happy to take a look if you get back to me.

rmzelle commented 5 years ago

"Document preferences seem to be set correctly, but the reference extractor is only pulling 15 of 20 unique records."

Looking at your screenshot again, perhaps you're just misunderstanding the ref-extractor results ("15 references extracted (20 duplicates removed)"). In your case, ref-extractor found 35 references in your Word document, of which 20 had already been cited earlier in the document, so it removed those 20, and only kept the 15 that cited distinct items from your Zotero library.
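To make the arithmetic concrete, here is a minimal sketch of that kind of de-duplication. It is an illustration only, not ref-extractor's actual code: `dedupe_citations` and the `item-N` identifiers are made up for the example, and it assumes each cited reference carries some identifier that is the same every time the same item is cited.

```python
from typing import Hashable, Iterable, List, Tuple


def dedupe_citations(cited_ids: Iterable[Hashable]) -> Tuple[List[Hashable], int]:
    """Return (unique items in first-seen order, number of duplicates removed)."""
    seen = set()
    unique = []
    duplicates = 0
    for item_id in cited_ids:
        if item_id in seen:
            duplicates += 1  # already cited earlier in the document
        else:
            seen.add(item_id)
            unique.append(item_id)
    return unique, duplicates


# Hypothetical document: 35 cited references that point at 15 distinct items.
cited = [f"item-{i % 15}" for i in range(35)]
unique, duplicates = dedupe_citations(cited)
print(f"{len(unique)} references extracted ({duplicates} duplicates removed)")
# prints "15 references extracted (20 duplicates removed)"
```

The two numbers in the message add up to the total number of cited references found (15 + 20 = 35); the second number is not the count of unique records.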

jmgreyber commented 5 years ago

Hi! Thank you so much for circling back about this. I have been out of my office for a few days because of Jewish holidays. Your suggestion makes sense…I will take a look at it when I am back in my office.

Thanks, Jennifer

From: Rintze M. Zelle <notifications@github.com>
Sent: Saturday, September 28, 2019 9:02 AM
To: rmzelle/ref-extractor <ref-extractor@noreply.github.com>
Cc: Jennifer Greyber <jennifer.greyber@duke.edu>; Mention <mention@noreply.github.com>
Subject: Re: [rmzelle/ref-extractor] Reference Extractor not extracting all references (#31)
