manubot / rootstock

Clone me to create your Manubot manuscript
https://manubot.github.io/rootstock/

Advice to deal with supplementary material? #225

Open jmonlong opened 5 years ago

jmonlong commented 5 years ago

We tried out manubot for our manuscript and one part that required some manual work was dealing with the supplementary information. Maybe we missed something and there are easier ways to deal with that.

For example, is there a way to have automated supp fig/table numbers? We used tag="S1" etc., but then when we add a new supp fig we have to manually rename every following figure. I understand that this is more about pandoc-fignos than manubot, but in practice how do manubot users deal with that?

The other thing was separating the supplements from the main text. While writing, we added the supplement as a section, but for submission we had to separate it. Ideally there would be a way to produce two documents (e.g. two PDFs), each with only the bibliography that it needs, but still linked in terms of fig/table numbers. Something like what LaTeX does through the xr package.

dhimmel commented 5 years ago

Hey @jmonlong. Cool to see your usage of Manubot and congrats on reaching preprint stage.

> is there a way to have automated supp fig/table numbers

I've only done supplemental information once with Manubot for the Sci-Hub Coverage Study manuscript. We ended up doing a similar approach where we hardcoded the numbering like:

![
**Coverage by country of publication.**
Scopus assigns each journal a country of publication.
Sci-Hub's coverage is shown for countries with at least 100,000 articles.
](https://cdn.rawgit.com/greenelab/scihub/e35cc7b0d3b6dd65bf8ce18945007d2b44a6be1e/figure/coverage-by-country.svg){#fig:countries tag="4—figure supplement 1" width="100%"}

I agree it's not ideal and I don't have any suggestions that would help with your supplement.
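One way to reduce the manual renaming, sketched here under assumptions, is a small pre-processing script that rewrites the hardcoded tags in document order. It assumes the supplementary tags all look like `tag="S<number>"` inside a `{#fig:...}` or `{#tbl:...}` attribute block; the function name `renumber_supplementary_tags` is hypothetical and not part of Manubot or pandoc-fignos. In-text references use the `#fig:` identifiers rather than the tags, so they should not need to change.

```python
import re

# Assumption: supplementary items carry a hardcoded tag such as
# {#fig:countries tag="S3" width="100%"} or {#tbl:samples tag="S2"}.
TAG_PATTERN = re.compile(r'(\{#(fig|tbl):[^}\s]+[^}]*?\btag=")S\d+(")')


def renumber_supplementary_tags(markdown: str) -> str:
    """Rewrite tag="S<n>" values so they run S1, S2, ... in document order.

    Figures and tables are counted separately. Attribute blocks without
    an S-style tag (i.e. main-text figures) are left untouched.
    """
    counters = {"fig": 0, "tbl": 0}

    def replace(match: re.Match) -> str:
        kind = match.group(2)  # "fig" or "tbl"
        counters[kind] += 1
        return f"{match.group(1)}S{counters[kind]}{match.group(3)}"

    return TAG_PATTERN.sub(replace, markdown)


if __name__ == "__main__":
    example = (
        '![New figure](new.png){#fig:new tag="S9"}\n'
        '![Old figure](old.png){#fig:old tag="S1"}\n'
    )
    print(renumber_supplementary_tags(example))
```

Running this over the supplement's content file before each build would keep the tags sequential after a figure is inserted, at the cost of one more step in the build.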

> Ideally if there was a way to produce two documents (e.g. two pdfs) each with only the bibliography that it needs, but still linked in terms of fig/table numbers

Would it work to make sure the SI starts on a new PDF page at the end of the document (using {.page_break_before} in the header, available in the latest rootstock frontend), and then split the PDF into two?

I'll look into xr and think about this some more. One complicating factor is that it seems like eventually manual edits to the DOCX are required to meet submission requirements for a journal. Therefore, spending too much effort automating a tweak is not always that helpful (since other tweaks will have to be done manually).

dhimmel commented 5 years ago

P.S. I submitted a PR to upgrade your manuscript to the latest Manubot at https://github.com/jmonlong/manu-vgsv/pull/103. Excited for any feedback on the new frontend features.

jmonlong commented 5 years ago

Thanks for the quick reply and the PR.

> Would it work to make sure the SI starts on a new PDF page at the end of the document (using {.page_break_before} in the header, available in the latest rootstock frontend), and then split the PDF into two?

Yes, I thought about that, but in the end it made more sense to edit the DOCX output. That way we could add a TOC for the supplements and also have page numbers.

> spending too much effort automating a tweak is not always that helpful (since other tweaks will have to be done manually).

Totally understand. I've always been a LaTeX user, so I was trying to find a way to avoid any manual tweaks (and Word/LibreOffice) and do everything with manubot/pandoc. In the end, going through DOCX is not that bad, just a bit ugly.

slochower commented 5 years ago

> For example, is there a way to have automated supp fig/table numbers? We used tag="S1" etc., but then when we add a new supp fig we have to manually rename every following figure. I understand that this is more about pandoc-fignos than manubot, but in practice how do manubot users deal with that?

It may be worth noting that figure prefixes based on section will show up in a coming version of pandoc-crossref: https://github.com/lierdakil/pandoc-crossref/pull/174

dhimmel commented 5 years ago

Reminder to check out pandoc's --file-scope argument for potentially creating a supplement that can stand alone from the main text:

> Parse each file individually before combining for multifile documents. This will allow footnotes in different files with the same identifiers to work as expected. If this option is set, footnotes and links will not work across files. Reading binary files (docx, odt, epub) implies --file-scope.

It sounds like it doesn't create separate reference sections, though, so it may not be exactly the right solution.

adebali commented 4 years ago

@dhimmel How about having a pre-defined prefix for supplementary material and allowing an option in the config.yaml file: supplementary: true?

If the supplementary flag is true, manubot can parse everything except the supplementary files for the main text. A duplicate of the same manubot process can then run for the supplementary pages only.

For instance, we can dedicate `9*_*.md` to supplementary material only.

We could also make the prefix customizable in the config file:

supplementary prefix: 9 or 91 etc.

We can ignore those files for the main text and have a separate build process for the supplementary material that uses only 9*_*.md. With this, we would have a separate reference list for the supplementary material, just like two separate manuscripts.

We can merge these "two" manuscripts at the very end.
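The file selection step of this proposal can be sketched in a few lines. This is only an illustration of the idea: Manubot has no `supplementary` option today, the function `split_content_files` and its `supplementary_prefix` parameter are hypothetical, and the file names in the usage example are made up.

```python
def split_content_files(file_names, supplementary_prefix="91"):
    """Partition content file names into (main, supplement) lists.

    Files whose name starts with the (hypothetical) supplementary prefix go
    to the supplement; everything else stays in the main text. Both lists
    are sorted by name, matching the numeric-prefix ordering of content files.
    """
    main, supplement = [], []
    for name in sorted(file_names):
        if name.startswith(supplementary_prefix):
            supplement.append(name)
        else:
            main.append(name)
    return main, supplement


if __name__ == "__main__":
    names = [
        "02.introduction.md",
        "91.supplement-figures.md",
        "01.abstract.md",
        "90.back-matter.md",
    ]
    main_files, supplement_files = split_content_files(names)
    print("main:", main_files)
    print("supplement:", supplement_files)
```

Each list could then be handed to its own build, which is what would give the supplement its own reference list. Note that a bare prefix of 9 would also capture a file such as 90.back-matter.md, so a two-digit prefix is probably safer.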

olgabot commented 4 years ago

Running into this problem as well. I like the idea of having a separate supplementary material "manuscript" that then gets merged in the end. If there are tons and tons of figures (e.g. Nature's "Extended Data" section can be infinite...), it may make sense for the original manuscript to use e.g. figure numbers 001, 002, etc. and the supplement to use 101, 102, etc.

agitter commented 4 years ago

Running the Manubot build process separately for the main text and supplement seems feasible. I expect that approach would sacrifice figure cross-referencing, i.e., I couldn't reference "Fig S1" from the main text.

Without cross-referencing, is having two manuscripts (main text and supplement) built from one repository more useful than having a separate supplement repository?

adebali commented 4 years ago

> Without cross-referencing, is having two manuscripts (main text and supplement) built from one repository more useful than having a separate supplement repository?

@agitter I would say so.

I tend to have one repository for the entire project. This would include scripts, output files (images etc.). Therefore one output directory is useful to me until I know which figure goes to the main manuscript and which goes to supplementary.

Besides cross-referencing, another disadvantage of one repository would be the inability to have separate reference lists for the main manuscript and the supplement. I think references cited in the supplement would be added to the main reference list, which could be undesirable.
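To get a feel for how much the two reference lists would differ, one could compare the citation keys used in each part. The sketch below assumes Manubot-style citations such as `@doi:...` or `@pmid:...` and uses a simplified regular expression that is an assumption, not Manubot's actual citation parser; `citation_keys` is a hypothetical helper.

```python
import re

# Simplified, assumed pattern for keys such as @doi:10.1/abc or @pmid:123.
CITATION_PATTERN = re.compile(
    r"(?<!\w)@([A-Za-z0-9][\w:.#$%&\-+?<>~/]*[A-Za-z0-9/])"
)
# Cross-references to figures, tables, equations, and sections are not citations.
CROSSREF_PREFIXES = ("fig:", "tbl:", "eq:", "sec:")


def citation_keys(markdown: str) -> set:
    """Return the set of citation keys found in a Markdown string."""
    keys = set(CITATION_PATTERN.findall(markdown))
    return {key for key in keys if not key.startswith(CROSSREF_PREFIXES)}


if __name__ == "__main__":
    main_text = "Intro [@doi:10.1/abc; @pmid:123], see @fig:one."
    supplement = "Methods [@pmid:123; @url:https://example.org/x]."
    # Keys cited only in the supplement would need their own reference list.
    print(sorted(citation_keys(supplement) - citation_keys(main_text)))
```

The set difference shows which references would leak into the main reference list if the supplement were built as part of the same document.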

kescobo commented 4 years ago

I'm just running into this problem as well. It's somewhat annoying to have to deal with this, since the whole idea of a "supplement" is based on the archaic practice of prioritizing print above everything else. But thems the breaks.

To summarize my ideal scenario (I think this echoes a lot of what was already said):

Understandable that not all of this may be possible, of course :-)

slobentanzer commented 1 year ago

Hi all, great work on the package! I am using Manubot for two articles I am currently writing, and so far it has been great.

I just stumbled across this discussion (since I also have a supplement I'd like to include), and was wondering whether there have been any integrations or general developments regarding this issue in Manubot. Is there documentation about the preferred way (or at least some way) of handling the supplement?

Many thanks!

agitter commented 1 year ago

@slobentanzer there hasn't been any Manubot development that makes it easier to elegantly handle supplements. There aren't any technical limitations I know of, and if we switch to pandoc-crossref for figure referencing it may enable the typical supplementary figure numbering (and other issues https://github.com/manubot/rootstock/issues/435#issuecomment-1072870077). However, no one has explored that.

The approach @dhimmel described above, with the supplement as a separate subsection and content file and with manual figure numbering, may still be the best option.

slobentanzer commented 1 year ago

Hi @agitter, thanks for the update. I will look into it and maybe file a PR if I think it's worth the time / necessary. :)