ReScience / template

Template for article submission
GNU General Public License v3.0

difficulty converting metadata file yaml -> tex #6

Closed mcbaneg closed 4 years ago

mcbaneg commented 4 years ago

Setup: Windows 10 (64-bit), MiKTeX, Python 2.7 with ruamel.yaml installed, Cygwin

I modified the metadata.yaml file.
I first tried 'make' as recommended, but I have Python 2.7 installed and the command failed at python3.

I then tried running it directly (within Windows PowerShell), and got

    PS C:\virial\rescience-c\template-master> python ./yaml-to-latex.py -i metadata.yaml -o metadata.tex
    Traceback (most recent call last):
      File "./yaml-to-latex.py", line 71, in <module>
        from article import Article
      File "C:\virial\rescience-c\template-master\article.py", line 4, in <module>
        import yaml
    ImportError: No module named yaml

Evidently the module name in ruamel.yaml (which seems to be the most up-to-date yaml kit for python) is different from that expected. I don't see documentation on which module (or which python version) authors are expected to be using. Suggestions?

Thanks, G.

khinsen commented 4 years ago

The YAML package that you need is PyYAML. I have never heard of ruamel.yaml before. It seems to be a fork of PyYAML, but I have no idea how compatible it is. I also don't know whether a sufficiently recent version of either YAML package still supports Python 2.7.

We should have clearer instructions for the template. I added them in other ReScience repositories, but not so far on this one. Unfortunately reproducibility is becoming a problem even for article submissions!
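For reference, a quick check along the following lines shows which YAML package a given Python interpreter can import. It is only a sketch and not part of the template: the function names `find_yaml_backend` and `environment_problems` are made up for illustration. PyYAML is imported as `yaml`, whereas ruamel.yaml is imported as `ruamel.yaml`, which is why the `import yaml` in article.py fails when only ruamel.yaml is installed.

```python
import importlib
import sys


def find_yaml_backend(candidates=("yaml", "ruamel.yaml")):
    """Return the first importable module name from candidates, or None."""
    for name in candidates:
        try:
            importlib.import_module(name)
        except ImportError:
            continue
        return name
    return None


def environment_problems():
    """List reasons why the template's Python script would likely fail here."""
    problems = []
    if sys.version_info[0] < 3:
        problems.append(
            "Python 3 is needed; this interpreter is Python %d" % sys.version_info[0]
        )
    # article.py does 'import yaml', so only PyYAML satisfies it.
    if find_yaml_backend(("yaml",)) is None:
        problems.append("PyYAML is missing (article.py does 'import yaml')")
    return problems


print(environment_problems() or "Python and PyYAML look fine")
```

Run with the same interpreter that the makefile would call, this prints either a list of problems or a short confirmation.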

rougier commented 4 years ago

That's a clear but unfortunate illustration of the reproducibility problem. I think I designed the yaml-to-latex.py script with Python 3 and never tested it with Python 2. Even though the PyYAML package is quite standard, it might be good to add a pointer in the template repository.

By the way, if your entry is for the Ten Years Reproducibility Challenge, we'll soon give some proposals for authors on what to write in the article. We're a bit late on that.

mcbaneg commented 4 years ago

It is for that challenge. I have a first draft of my paper already, if you’d like to see what somebody produced with just the hints that are already out there. Problems in my draft might help you figure out what to put in the guidelines ☺

My difficulty (other than the simple mechanics of submission) is that I’m not convinced the draft is interesting enough to be worth submitting. It does demonstrate that straightforward use of well-established tools (Fortran, ACM-TOMS routines, BLAS) is likely to produce long-lived programs if you just hang on to the source code and sample inputs.

-G.

khinsen commented 4 years ago

@mcbaneg Please do submit your work! Your conclusion is just the kind of outcome we'd like to see from the challenge: what has worked in the past to maintain code in working state, and what hasn't. If people submit only the difficult-to-reproduce cases, we might erroneously conclude that nothing has ever worked!

rougier commented 4 years ago

@mcbaneg As @khinsen suggests, you can submit and your submission will serve as a testbed for others. Among the things that might be interesting are:

  1. How did you preserve the sources?
  2. Did you take care to record the RNG seed (if you use one)?
  3. Did you save the command-line options (if you need any)?
  4. Did you need to adapt your sources?
  5. Did you need to adapt your libraries?
  6. What guided your choice of Fortran over other languages at that time?
  7. etc.

khinsen commented 4 years ago

Good points @rougier! I'd like to emphasize the utility of communicating the choices (and the motivations behind them) made at the time of publication, even if they risk being distorted by hindsight. That's something we can only get out of authors doing reproductions of their own work. For example, I realized that I never preserved or published code for reproducibility, but only to make it available for reuse by others. As a consequence, I am always missing the last small steps: command-line arguments, that five-line script that ties computations together, etc.
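To make those "last small steps" concrete, here is a minimal sketch of how a script could write down its seed and command-line arguments next to its results. Nothing here comes from the template or the challenge guidelines: the function names, the record layout, and the file name `run-record.json` are all hypothetical.

```python
import json
import platform
import random
import sys


def make_record(seed, argv=None):
    """Collect the details that are easy to lose: seed, arguments, interpreter."""
    return {
        "seed": seed,
        "argv": list(sys.argv if argv is None else argv),
        "python": platform.python_version(),
    }


def save_record(record, path="run-record.json"):
    """Write the record as JSON so it can be archived with the outputs."""
    with open(path, "w") as handle:
        json.dump(record, handle, indent=2)


# Hypothetical usage at the top of a simulation script:
seed = 12345
random.seed(seed)  # seed the generator with the value that gets recorded
record = make_record(seed)
# save_record(record)  # uncomment to write run-record.json
```

Keeping such a file with the published results would answer the questions about seeds and command-line options without relying on memory.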

khinsen commented 4 years ago

This discussion really belongs in https://github.com/ReScience/ten-years/issues/4.

rougier commented 4 years ago

Yes, and we should start an author-instructions.md document. @mcbaneg Feel free to close the issue and let's continue the discussion at https://github.com/ReScience/ten-years/issues/4.

mcbaneg commented 4 years ago

Okay, I have submitted a paper for the Ten Years challenge (3882888). I'm comfortable with it serving as a trial case for working out both (1) what you hope for in a paper for this challenge, and (2) what submission conventions you want to use.

Two technical points:

  1. I tried to drag-and-drop the metadata.yaml file into the submission issue box, but was given a "we do not accept that file type" message. It felt silly to retype into the box several bits of information that I had already put into the .yaml file.

  2. I feel like your expectations for what tools authors will have immediately available are high enough to discourage some potential authors. They include:

     - make
     - LaTeX, including latexmk
     - Python 3.x with PyYAML (2.x can run PyYAML but its treatment of locales is different and it doesn't work with the template makefile for that reason)
     - perl (necessary for latexmk)

Many potential authors will be familiar with all these tools, but may well not have them installed on the computers they normally use for preparing publications. I am familiar with them all, and use some (including make and LaTeX) daily. However, I had to install latexmk, perl, Python 3 (I had 2.7, and spent the time to find out it didn't work for this purpose), and PyYAML on my laptop to complete the submission.
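A small preflight check could report the missing executables before an author starts the build. The sketch below is an assumption about how that might look, not something the template provides; `missing_tools` is a made-up name, and the default list only covers the command-line tools named above (the Python version and PyYAML would need a separate check).

```python
import shutil


def missing_tools(tools=("make", "latexmk", "perl")):
    """Return the tools from the list that are not found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Not found on PATH: " + ", ".join(missing))
else:
    print("All listed tools were found.")
```

`shutil.which` is in the Python 3 standard library and also works on Windows, so the same check runs on any platform where Python 3 is installed.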

rougier commented 4 years ago

Thanks !

Yes, good point about the large set of tools needed just to compile. Note that there's an Overleaf template (which may need to be updated) that simplifies things. The only problem is that you still need to generate the yaml file. We could also have a simple web form to create the yaml file, but I'm a bit clueless on how to do that.

As for the metadata upload, I'm not sure I get your point.

khinsen commented 4 years ago

@mcbaneg Thanks for the feedback on our submission procedure! We should probably simplify the required toolchain, ideally to the point that a TeXlive installation is sufficient to do everything. On the other hand, if that means reading YAML from TeX I'll probably change my mind!

@rougier If I understand our current setup correctly, make and latexmk are needed only for streamlining the PDF generation. We could perhaps provide a shell script that runs a worst-case scenario, assuming everything needs to be redone. But then, would a shell script work under Windows?

rougier commented 4 years ago

Yes, make and latexmk are not really needed. You can run xelatex/biber/xelatex/xelatex and you're done. The metadata yaml and latex files can also be filled in manually in case of problems (there is no need to use the python script to create the latex metadata from the yaml metadata; it's just more convenient, if it runs).
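Since a shell script may not run under Windows, the worst-case sequence could be wrapped in a few lines of Python instead. This is a sketch under assumptions: the main file is taken to be `article.tex` (so the base name is `article`), and `build_commands` and `build_pdf` are illustrative names, not part of the template.

```python
import subprocess


def build_commands(basename="article"):
    """Return the full rebuild sequence: xelatex, biber, xelatex, xelatex."""
    return [
        ["xelatex", basename],
        ["biber", basename],
        ["xelatex", basename],
        ["xelatex", basename],
    ]


def build_pdf(basename="article", runner=subprocess.check_call):
    """Run every step in order; check_call raises if a step fails."""
    for command in build_commands(basename):
        runner(command)
```

Calling `build_pdf()` from the template directory would then redo everything without make or latexmk, provided xelatex and biber are on the PATH. The `runner` parameter is only there so the sequence can be inspected or tested without invoking TeX.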