MartinPaulEve / meTypeset

meTypeset is a tool to convert from Microsoft Word .docx format to NLM/JATS-XML for scholarly/scientific article typesetting.

testing /tests, some error messages #101

Open ppKrauss opened 7 years ago

ppKrauss commented 7 years ago

I am trying to run all the tests. I found no "installation test kit", but it is easy to imagine one.

basic transform, docx

I listed the files with ls tests/*.docx and ran python bin/meTypeset.py docx tests/NAME.docx /tmp/tests/NAME for each one. There were no error messages, except for the following items:

036 - the standard run, python bin/meTypeset.py docx tests/036.docx /tmp/tests/036, and the run with the -z option, python bin/meTypeset.py docx -z tests/036.docx /tmp/tests/036, both print "fldSimple: unrecognized type":

ADDIN ZOTERO_ITEM {
 "citationID":"U929WTFq",
 "properties":{
    "formattedCitation":"(2005)",
    "plainCitation":"(2005)"
 },
 "citationItems":[
  {
    "id":1462,
    "uris":["http://zotero.org/users/401864/items/UDAF6SS2"],
    "uri":["http://zotero.org/users/401864/items/UDAF6SS2"],
    "suppress-author":true
   }
  ]
} 

LinkSubElements - python bin/meTypeset.py docx tests/LinkSubElements.xml /tmp/tests/LinkSubElements = "Traceback (most recent call last):"

  File "bin/meTypeset.py", line 254, in <module>
    main()
  File "bin/meTypeset.py", line 250, in main
    me_typeset_instance.run()
  File "bin/meTypeset.py", line 242, in run
    self.run_modules()
  File "bin/meTypeset.py", line 152, in run_modules
    DocxToTei(self.gv).run(True, self.args['--proprietary'])
  File "/usr/local/meTypeset/bin/docxtotei.py", line 128, in run
    with zipfile.ZipFile(self.gv.input_file_path, "r") as z:
  File "/usr/lib/python2.7/zipfile.py", line 770, in __init__
    self._RealGetContents()
  File "/usr/lib/python2.7/zipfile.py", line 811, in _RealGetContents
    raise BadZipfile, "File is not a zip file"
zipfile.BadZipfile: File is not a zip file

WMF - python bin/meTypeset.py docx tests/WMF.docx /tmp/tests/WMF

W: Unknown node under /registry/extlang: deprecated
W: Unknown node under /registry/grandfathered: comments
W: Unknown node under /registry/grandfathered: comments
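
The per-file loop described above can be sketched roughly as follows. This is only an illustration: the helper names build_commands and run_all are hypothetical, the sketch is written for Python 3 while the thread runs meTypeset under Python 2.7 (so the interpreter is passed in as "python", as in the commands above), and it assumes a failing conversion exits with a non-zero status, which the thread does not confirm.

```python
import glob
import os
import subprocess


def build_commands(tests_dir="tests", out_root="/tmp/tests",
                   script="bin/meTypeset.py", interpreter="python"):
    """Return one meTypeset command line (as an argv list) per .docx file."""
    commands = []
    for path in sorted(glob.glob(os.path.join(tests_dir, "*.docx"))):
        name = os.path.splitext(os.path.basename(path))[0]
        commands.append([interpreter, script, "docx", path,
                         os.path.join(out_root, name)])
    return commands


def run_all(commands):
    """Run each command; collect (argv, stderr) for non-zero exit statuses.

    Assumption: a failed conversion is signalled by the exit status.
    Messages that are only printed would need a separate check of the output.
    """
    failures = []
    for argv in commands:
        result = subprocess.run(argv, stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE)
        if result.returncode != 0:
            failures.append((argv, result.stderr.decode(errors="replace")))
    return failures


if __name__ == "__main__":
    for argv, err in run_all(build_commands()):
        print(" ".join(argv))
        print(err)
```

Because only *.docx files are globbed, a file such as tests/LinkSubElements.xml would not be passed to the docx command by this loop.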

basic transform, other formats (.doc, etc.)

034 - python bin/meTypeset.py doc tests/034.doc /tmp/tests/034

W: Unknown node under /registry/extlang: deprecated
W: Unknown node under /registry/grandfathered: comments
W: Unknown node under /registry/grandfathered: comments

... more tests ...

later

MartinPaulEve commented 7 years ago

Hi - you need to install the robot framework to run the tests. RIDE can be used to give a visual interface to this. All tests are currently passing in the correct framework.


ppKrauss commented 7 years ago

Hi, thanks for the feedback (!). The Canadian RIDE? ;-)

Hmm... but if "all tests are currently passing", why do the command lines I listed not work?
Perhaps I need to install something more...

MartinPaulEve commented 7 years ago

The "W:" errors are from unoconv, not meTypeset.
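
When reading captured output, it can help to set the "W:" lines aside so that any remaining messages stand out. The following is a minimal sketch under assumptions: the function name split_warnings is hypothetical, and it simply treats any line beginning with "W: " as one of these warnings, without checking where the line actually came from.

```python
def split_warnings(output_text, prefix="W: "):
    """Split captured output into (warning_lines, other_lines)."""
    warning_lines, other_lines = [], []
    for line in output_text.splitlines():
        if line.startswith(prefix):
            warning_lines.append(line)
        else:
            other_lines.append(line)
    return warning_lines, other_lines


if __name__ == "__main__":
    sample = "\n".join([
        "W: Unknown node under /registry/extlang: deprecated",
        "W: Unknown node under /registry/grandfathered: comments",
        "W: Unknown node under /registry/grandfathered: comments",
    ])
    warning_lines, other_lines = split_warnings(sample)
    print(len(warning_lines), "warning lines;", len(other_lines), "other lines")
```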

The command bin/meTypeset.py docx tests/LinkSubElements.xml won't work since LinkSubElements.xml is not a docx file, it's an XML file.
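
The BadZipfile traceback follows from this: a .docx file is a ZIP archive, and a plain XML file is not. A pre-flight check along these lines would catch the mismatch before conversion. It is a sketch, not part of meTypeset: the function name looks_like_docx is hypothetical, and testing for the word/document.xml member is an assumption based on how Word normally lays out the archive.

```python
import zipfile


def looks_like_docx(path):
    """Return True if path is a ZIP archive containing word/document.xml."""
    if not zipfile.is_zipfile(path):
        # Plain XML (or any non-ZIP file) ends up here instead of raising.
        return False
    with zipfile.ZipFile(path, "r") as archive:
        return "word/document.xml" in archive.namelist()
```

Under these assumptions, looks_like_docx("tests/LinkSubElements.xml") would return False rather than raise an exception.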

Basically, the test documents are designed to exercise specific parts of the stack, not to just test whether in general it's working. If you install Robot and RIDE and there are still errors on your system running the test suite, I'll be happy to investigate.

ppKrauss commented 7 years ago

About "install Robot and RIDE": is that pip install robotframework-ride? In my Ubuntu environment the message was "Successfully installed robotframework-ride-1.5.2.1", but nothing changed. Perhaps running the "meTypeset test kit" needs a specific command. How do I run all the tests? How do I point Robot Framework at the tests folder, or is there some "make test"?

To check whether the environment had changed, I ran the same tests again for LinkSubElements.docx and 036.docx, and got the same errors.


For my personal use of the /tests folder, my motivation is only to check that my meTypeset environment behaves exactly like yours: let's label this motivation check-tests.

About your explanation that "the test documents are designed to exercise specific parts of the stack": perhaps check-tests is a new proposal, namely to use a subset of /tests to check a fresh meTypeset installation. I can specify this subset.

EXAMPLE: as I showed, for meTypeset.py docx check-tests the subset is   S1={001.docx, 002.docx, ..., WMF.docx}   with n(S1)=56-2 or n(S1)=56-4, where the excluded files are those with the unoconv error or any others that should be avoided in check-tests.
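
The proposed subset can be pictured as a set difference over the .docx files in the tests folder. In the sketch below the function name check_test_subset is hypothetical, and the exclusion list is only a placeholder, since the thread does not settle which files should be left out.

```python
import glob
import os


def check_test_subset(tests_dir, excluded):
    """Return the sorted .docx file names in tests_dir that are not excluded."""
    all_docx = {os.path.basename(p)
                for p in glob.glob(os.path.join(tests_dir, "*.docx"))}
    return sorted(all_docx - set(excluded))


if __name__ == "__main__":
    # Placeholder exclusion list, for illustration only.
    subset = check_test_subset("tests", excluded=["036.docx"])
    print("n(S1) =", len(subset))
```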

MartinPaulEve commented 7 years ago

If RIDE and Robot are correctly installed you should be able to just run ride.py and then load up the interface to run the tests...


Professor Martin Paul Eve Chair of Literature, Technology and Publishing Birkbeck, University of London

T: 0203 073 8420 E: martin.eve@bbk.ac.uk W: https://www.martineve.com R: 416, 43 Gordon Square, London, WC1H 0PD

Books: https://www.martineve.com/books/ Articles: https://www.martineve.com/c-v/

Series Editor: New Horizons in Contemporary Writing (Bloomsbury) Director, Birkbeck Centre for Technology and Publishing Founder, Open Library of the Humanities (https://www.openlibhums.org) Chief Editor, Orbit (https://www.pynchon.net) Senior Online Editor, Alluvium, (http://www.alluvium-journal.org)

ppKrauss commented 7 years ago

Hi, thanks for everything (!). I am waiting for a reply from RIDE (https://github.com/robotframework/RIDE/issues/1666).

... Independently of testing, as a next step I am studying how to collaborate: we have some filters that can be adapted to complement meTypeset with, e.g., HTML input (before html2tei), table header recognition/parsing, and front-metadata extraction. PS: my focus is on DOCX, ODT and HTML inputs, and JATS output.

MartinPaulEve commented 7 years ago

Hi Peter,

Thanks again for this...

Have you tried with python-wxgtk3.0?

Best wishes,

Martin


ppKrauss commented 7 years ago

Thanks Martin, apt-get install python-wxgtk3.0 works fine and the package is installed, but it does not change the reported problems.

ride.py
Wrong wxPython version.
You need to install wxPython 2.8.12.1 with unicode support to run RIDE.

... seems a kind of trap in a dependency hell... ;-)

PS: no news from the RIDE community.

axfelix commented 7 years ago

Hi @ppKrauss ,

For what it's worth, I've never used RIDE for the tests -- the only problem I ever had running them was when I was on a platform that had python3 as the default python on the path and needed to call python2 directly instead :)

You should be able to just do cd tests && pybot *.txt.
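
For anyone who prefers to drive this from a script, the same invocation can be sketched as below. The helper names are hypothetical. The fallback to a runner called robot is an assumption based on newer Robot Framework releases, which as far as I know ship a robot command in place of pybot. The sketch is written for Python 3, separately from whichever Python the runner itself uses.

```python
import glob
import os
import shutil
import subprocess


def find_robot_runner(candidates=("pybot", "robot")):
    """Return the path of the first runner found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


def suite_files(tests_dir="tests"):
    """Return the *.txt suite file names inside tests_dir, sorted."""
    return sorted(os.path.basename(p)
                  for p in glob.glob(os.path.join(tests_dir, "*.txt")))


def run_suite(tests_dir="tests"):
    """Rough equivalent of: cd tests && pybot *.txt"""
    runner = find_robot_runner()
    if runner is None:
        raise RuntimeError(
            "no Robot Framework runner found; try: pip install robotframework")
    return subprocess.call([runner] + suite_files(tests_dir), cwd=tests_dir)
```

Calling run_suite() from the repository root would return the runner's exit status.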

axfelix commented 7 years ago

Hi again @ppKrauss,

I've just used the server login you sent me and it looks like you didn't actually have pybot installed to run the tests -- I've just installed it via pip install robotframework and everything appears to be passing in your installation.

ppKrauss commented 7 years ago

! thanks @axfelix !!