MartinPaulEve / meTypeset

meTypeset is a tool to convert from Microsoft Word .docx format to NLM/JATS-XML for scholarly/scientific article typesetting.

Info request about references #147

Open pakojil opened 3 years ago

pakojil commented 3 years ago

Hello Martin. First of all, thank you for your effort; your tool seems great to me. However, I'm going crazy with the references, which always come out with the <mixed-citation> tag. Even with Zotero installed, and with the citations and references inserted through Zotero, I can't get it to work properly. I work on macOS Catalina, and everything works for me except the issue described above. I have followed the instructions and set the path to the Zotero database, but nothing changes. I am also unable to use the bibscan option. Could you clarify what format this NLM XML bibliography must be in, with an example file, and tell me where and with what tool it is generated? A complete command-line example with that option would also help, please. Thanks a lot.

MartinPaulEve commented 3 years ago

Hi,

Can you please upload me your test document so I can see what you are doing?

Best wishes,

Martin



pakojil commented 3 years ago

Thank you, @MartinPaulEve. I have prepared the same document in 3 versions:

Simple:

Normal text, with citations in parentheses and a manual reference list. It identifies in-line citations well and links them correctly in interactive mode:

<xref rid="id" ref-type="bibr" id="IDa9ba38b1-bd56-4d88-b7d3-e7fca59b184f">Erren, Shaw y Morfeld, 2016, p. 1431</xref>

But it creates the reference with the <mixed-citation> tag, both in non-interactive mode:

<ref><mixed-citation>ERREN, T.C.; SHAW, D. y MORFELD, P. Analyzing the publish-or-perish paradigm with game theory: The prisoner’s dilemma and a possible escape. Science and Engineering Ethics, 2016, vol. 22, nº 5, p. 1431–1446. Disponible en: https://doi.org/10.1007/s11948-015-9701-x.</mixed-citation></ref>

and in interactive mode:

<ref><mixed-citation>ERREN, T.C.; SHAW, D. y MORFELD, P. Analyzing the publish-or-perish paradigm with game theory: The prisoner’s dilemma and a possible escape. Science and Engineering Ethics, 2016, vol. 22, nº 5, p. 1431–1446. Disponible en: https://doi.org/10.1007/s11948-015-9701-x.</mixed-citation></ref>

Mendeley:

Same text, but using the Mendeley add-in. In interactive mode, linking is correct, but it still uses the <mixed-citation> tag:

<ref><mixed-citation>ERREN, T. C., SHAW, D., & MORFELD, P. (2016). Analyzing the publish-or-perish paradigm with game theory: The prisoner’s dilemma and a possible escape. Science and Engineering Ethics, 22(5), 1431–1446. https://doi.org/10.1007/s11948-015-9701-x.</mixed-citation></ref>

Zotero:

The same text, but with citations and the final reference list inserted using the Zotero add-on for Word 2016 for Mac. Linking does not work in either interactive or non-interactive mode:

<xref rid="TO_LINK" ref-type="bibr" id="ID38989560-1e20-4878-bc3f-08662127bb3d">ERREN et al., 2016, p. 1431</xref>

It also fails on the references, which are shown as <p> because the Zotero module fails completely:

<p>ERREN, T.C.; SHAW, D. y MORFELD, P. Analyzing the publish-or-perish paradigm with game theory: The prisoner’s dilemma and a possible escape. Science and Engineering Ethics, 2016, vol. 22, nº 5, p. 1431–1446. Disponible en: https://doi.org/10.1007/s11948-015-9701-x.</p>
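
One way to see the scale of the linking problem in an output file is to list every bibliographic `xref` whose `rid` does not match the `id` of any `ref` element. The sketch below uses only the Python standard library. The sample XML is a hypothetical fragment modelled on the snippets above (the `ref` id `ID1` and the surrounding `article` skeleton are made up for illustration), not real meTypeset output, and the helper name `unresolved_xrefs` is also hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: one xref that resolves, one still marked TO_LINK.
SAMPLE = """<article>
  <body>
    <p><xref rid="ID1" ref-type="bibr">Erren, Shaw y Morfeld, 2016, p. 1431</xref></p>
    <p><xref rid="TO_LINK" ref-type="bibr">ERREN et al., 2016, p. 1431</xref></p>
  </body>
  <back>
    <ref-list>
      <ref id="ID1"><mixed-citation>ERREN, T.C. (sample entry)</mixed-citation></ref>
    </ref-list>
  </back>
</article>"""


def unresolved_xrefs(xml_text):
    """Return the text of each bibliographic xref whose rid matches no ref id."""
    root = ET.fromstring(xml_text)
    # Collect every id that a citation could legitimately point to.
    ref_ids = {ref.get("id") for ref in root.iter("ref")}
    return [
        "".join(xref.itertext())
        for xref in root.iter("xref")
        if xref.get("ref-type") == "bibr" and xref.get("rid") not in ref_ids
    ]


unresolved = unresolved_xrefs(SAMPLE)
print(unresolved)
```

For a real file, the same function could be given the contents of the generated NLM XML; any citation still carrying a placeholder such as `TO_LINK` would show up in the returned list.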

The console dump is:

python3 ./bin/meTypeset.py docx ./Tests/Martin_EVE/Zotero.docx ./tests/Martin_EVE/output/Zotero_interactive -d --nogit --zotero --interactive
[Main] Running at aggression level 10 [grrr!]
[Main] Metadata file wasn't specified. /meTypeset/metadata/metadataSample.xml
[DOCX to TEI] Unzipping ./Tests/Martin_EVE/Zotero.docx to ./tests/Martin_EVE/output/Zotero_interactive/docx
[DOCX to TEI] Looking for presence of media directory ./tests/Martin_EVE/output/Zotero_interactive/docx/word/media
[DOCX to TEI] Running saxon transform (DOCX->TEI)
[Metadata Manipulator] Extracted an article ID: "10.1000/123456" from metadata
[Metadata Manipulator] Extracted an article title: "A Sample Article Title" from metadata
[Metadata Manipulator] Extracted a journal title: "The Journal Title" from metadata
[Metadata Manipulator] Extracted a name component: "Eve" from metadata
[Metadata Manipulator] Extracted a name component: "Martin Paul" from metadata
[Metadata Manipulator] Extracted an affiliation: "University of Lincoln" from metadata
[Size Classifier] Changing title to size 100
[Size Classifier] Changing heading 1 to size 100
[Size Classifier] Changing heading 2 to size 90
[TEI Manipulator] Enclosing and changing size: {http://www.tei-c.org/ns/1.0}p to hi
[Size Classifier] Changing heading 3 to size 80
[Size Classifier] Changing heading 4 to size 70
[Size Classifier] Changing heading 5 to size 60
[Size Classifier] Changing heading 6 to size 50
[Size Classifier] Changing heading 7 to size 40
[Size Classifier] Changing heading 8 to size 30
[Size Classifier] Changing heading 9 to size 20
[Size Classifier] Changing Title to size 100
[Size Classifier] Changing Heading 1 to size 100
[Size Classifier] Changing Heading 2 to size 90
[Size Classifier] Changing Heading 3 to size 80
[Size Classifier] Changing Heading 4 to size 70
[Size Classifier] Changing Heading 5 to size 60
[Size Classifier] Changing Heading 6 to size 50
[Size Classifier] Changing Heading 7 to size 40
[Size Classifier] Changing Heading 8 to size 30
[Size Classifier] Changing Heading 9 to size 20
[Size Classifier] Changing H1 to size 100
[Size Classifier] Changing H2 to size 90
[Size Classifier] Changing H3 to size 80
[Size Classifier] Changing H4 to size 70
[Size Classifier] Changing H5 to size 60
[Size Classifier] Changing H6 to size 50
[Size Classifier] Changing H7 to size 40
[Size Classifier] Changing H8 to size 30
[Size Classifier] Changing H9 to size 20
[Size Classifier] Explicitly specified size variations and their frequency of occurrence: {'90.0': 1}
[Size Classifier] Size (90.0) greater than or equal to 16. Treating as a heading.
[Size Classifier] Normalizing nested headings inside cit/quote blocks
[TEI Manipulator] Assigned IDs to all headings
[Size Classifier] Set root size as 90.0
[Zotero Handler] Stashed 4 references for bibliography parsing
[List Classifier] Leaving enclosed reference processing
[List Classifier] Scanning for superscripted footnote entries
[List Classifier] Found no superscripted footnote entries
[TEI Manipulator] Handled deleted text
[TEI Manipulator] Replaced 0 WMF images with open equivalents
[TEI Manipulator] Removed 0 nodes during cleanup
[TEI Manipulator] Cleaned 0 nested item bibliographic tags during cleanup
[TEI to NLM] Running saxon transform (TEI->NLM)
[NLM Manipulator] Found 0 comment()[. = "meTypeset:br"] nodes on which to close and open tag p
[NLM Manipulator] Found 0 comment()[. = "meTypeset:br"] nodes on which to close and open tag title
[NLM Manipulator] Found 0 comment()[. = "meTypeset:br"] nodes on which to insert break: td
[NLM Manipulator] Found 0 comment()[. = "meTypeset:br"] nodes on which to insert break: title
[Reference Linker] Stripped disallowed tags from reference tree
[Reference Stub Linker Object] Successfully linked 1909 stub
[Reference Stub Linker Object] Successfully linked ERREN et al., 2016, p. 1431 stub
[Reference Stub Linker Object] Successfully linked Angell, 1986 stub from sub element tail
[Reference Stub Linker Object] Successfully linked WAGER et al., 2015 stub from sub element tail
[Reference Linker] Found no references to link
[Reference Linker] Entering interactive mode
--------------------------------------------------------------------------------
Found an unhandled reference marker: 1909
[S]kip, Delete, deleTe all, Enter search, Ibid, enter Link id, skip Rest,
show Context? e
Enter search term: humphreys
Candidates:
# selection (default 1), Skip, Delete, deleTe all, Enter search, Ibid,
enter Link id, skip Rest, show Context? r
Leaving interactive mode on user command
--------------------------------------------------------------------------------
Found an unhandled reference marker: ERREN et al., 2016, p. 1431
[S]kip, Delete, deleTe all, Enter search, Ibid, enter Link id, skip Rest,
show Context? e
Enter search term: erren
Candidates:
# selection (default 1), Skip, Delete, deleTe all, Enter search, Ibid,
enter Link id, skip Rest, show Context? s
Skipping this item
--------------------------------------------------------------------------------
Found an unhandled reference marker: Angell, 1986
[S]kip, Delete, deleTe all, Enter search, Ibid, enter Link id, skip Rest,
show Context? s
Skipping this item
--------------------------------------------------------------------------------
Found an unhandled reference marker: WAGER et al., 2015
[S]kip, Delete, deleTe all, Enter search, Ibid, enter Link id, skip Rest,
show Context? s
Skipping this item
[Caption Classifier] Attempting to classify captions for table objects
[Caption Classifier] Attempting to classify captions for graphics objects [plain]
[Caption Classifier] Attempting to classify captions for graphics objects [sibling]
[Caption Classifier] Attempting to correct any mis-nested graphics elements
[NLM Manipulator] Attempting to correct any mis-nested paragraph elements
[Metadata Manipulator] Running metadata transform
Traceback (most recent call last):
  File "/meTypeset/./bin/meTypeset.py", line 257, in <module>
    main()
  File "/meTypeset/./bin/meTypeset.py", line 253, in main
    me_typeset_instance.run()
  File "/meTypeset/./bin/meTypeset.py", line 245, in run
    self.run_modules()
  File "/meTypeset/./bin/meTypeset.py", line 229, in run_modules
    BibliographyDatabase(self.gv).run()
  File "/meTypeset/bin/bibliographydatabase.py", line 422, in run
    self.process_zotero()
  File "/meTypeset/bin/bibliographydatabase.py", line 350, in process_zotero
    from zotero import libzotero
  File "/meTypeset/bin/zotero/libzotero.py", line 29, in <module>
    from zotero_item import zoteroItem as zotero_item
ModuleNotFoundError: No module named 'zotero_item'
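
The last frame of the traceback is consistent with a Python 2 style implicit relative import being run under Python 3: `libzotero.py` asks for `zotero_item` as a top-level module, and Python 3 only searches `sys.path`, not the importing package's own directory. The following is a minimal sketch that reproduces that behaviour with a throwaway package. The names `demo_pkg`, `demo_item` and `try_import` are hypothetical, and whether changing the import line is the right fix for meTypeset itself is an assumption that this thread does not confirm.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_package(root, import_line):
    """Create a hypothetical package demo_pkg whose lib.py imports demo_item."""
    pkg = Path(root) / "demo_pkg"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "demo_item.py").write_text("class Item:\n    pass\n")
    (pkg / "lib.py").write_text(import_line + "\n")


def try_import(import_line):
    """Return 'ok' if demo_pkg.lib imports, else the exception class name."""
    with tempfile.TemporaryDirectory() as tmp:
        build_package(tmp, import_line)
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            importlib.import_module("demo_pkg.lib")
            return "ok"
        except ImportError as exc:
            return type(exc).__name__
        finally:
            sys.path.remove(tmp)
            # Forget the throwaway modules so the next call starts clean.
            for name in [m for m in sys.modules if m.startswith("demo_pkg")]:
                del sys.modules[name]


# Python 2 style: the sibling module is named as if it were top-level.
implicit = try_import("from demo_item import Item")
# Python 3 style: the leading dot makes the import relative to the package.
explicit = try_import("from .demo_item import Item")
print(implicit, explicit)
```

If the same pattern applies here, running the tool under the Python version it was written for, or switching the affected imports to the explicit form, are the two obvious things to try.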

In the settings.xml file, I have set the path to my zotero.sqlite database, giving the full path up to the last forward slash before the file name: <mt:zotero>/Users/my_name/Zotero/</mt:zotero>
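
Independently of meTypeset, it can help to confirm that the configured directory really contains a readable `zotero.sqlite`. This is a minimal sketch, assuming the setting holds the directory (as described above) and that the file is an ordinary SQLite database. The helper name `check_zotero_dir` is hypothetical and is not part of meTypeset.

```python
import sqlite3
from pathlib import Path


def check_zotero_dir(directory):
    """Report whether <directory>/zotero.sqlite exists and opens as SQLite."""
    db_path = (Path(directory).expanduser() / "zotero.sqlite").resolve()
    if not db_path.is_file():
        return f"missing: {db_path}"
    try:
        # Open read-only so the database is never modified by this check.
        conn = sqlite3.connect(db_path.as_uri() + "?mode=ro", uri=True)
        try:
            count = conn.execute(
                "SELECT count(*) FROM sqlite_master WHERE type = 'table'"
            ).fetchone()[0]
        finally:
            conn.close()
    except sqlite3.Error as exc:
        # Covers a locked database (Zotero running) or a file that is not SQLite.
        return f"unreadable: {exc}"
    return f"ok: {count} tables"


# The path below is the placeholder from the settings.xml line above.
print(check_zotero_dir("/Users/my_name/Zotero/"))
```

A result starting with `missing` or `unreadable` would point at the path or the database file rather than at the conversion itself.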

I think I'm doing the right thing, so I don't understand the error messages or the program's unexpected behaviour. Maybe it's a problem with Word versions and Zotero add-ons. In that case, could you tell me which combination works correctly?

That is, what version of Word, what version of Zotero, which version of Zotero addon, and what operating system. I use macOS Catalina 10.15.7, Microsoft Word for Mac 16.45, Zotero 5.0.95, and the Zotero Word for Mac Integration 5.0.30.SA.5.0.95 addon.

On the other hand, could you tell me about the bibscan mode? I don't know what type of file it takes (an example would be much appreciated), nor how that file is obtained (whether it can be exported from Zotero, Mendeley or another application), nor how the option is actually used.

It would also help if you had an example docx file and a small library for a demo, with the correct command-line syntax and explanations of the --options referred to in your documentation.

Thank you very much for everything, and I apologize for the extensive and detailed message.

Mendeley.docx Simple.docx Zotero.docx