shimming-toolbox / rf-shimming-7t

Repository for the paper "B1+ shimming in the cervical spinal cord at 7T"
https://shimming-toolbox.github.io/rf-shimming-7t/
MIT License

Steps towards a NeuroLibre pre-print publication #42

Open jcohenadad opened 8 months ago

jcohenadad commented 8 months ago

We plan to submit the manuscript as a preprint to NeuroLibre. In this issue we discuss what it takes, who is doing what, etc.

mathieuboudreau commented 8 months ago

I told Nikola I'd take a crack at it with some time I have available this week, to get an idea of how quickly it could get done; I'll start a new repo for it just because I need to mess with some GitHub settings to get it working with Jupyter Book.

mathieuboudreau commented 8 months ago

Repo -> https://github.com/shimming-toolbox/rf-shimming-7t-neurolibre

jcohenadad commented 7 months ago

@mathieuboudreau we aim to submit the manuscript in ~1 week from now. Do you have an ETA for the NeuroLibre version?

mathieuboudreau commented 7 months ago

@jcohenadad A few updates and general points before answering:

Link: https://shimming-toolbox.github.io/rf-shimming-7t-neurolibre/

As for getting it published in NeuroLibre before you click submit:

Hope this is all clear - let me know which directions you'd like me to go in for some of the points above. Maybe I could push @agahkarakuzu a bit more to get access to the production server during the review stage to ensure that, after publication, the notebook would run on their servers without crashing during the registration step and without timing out.

agahkarakuzu commented 7 months ago

@mathieuboudreau to clarify:

> At the minimum, a DOI is generated & available much earlier than the final accept step from NeuroLibre

We can generate a DOI-formatted interactive preprint URL, but it will not be minted as an official DOI before the preprint is published. Also, to achieve this, we need to have the repo submitted and the REVIEW started (so that we have the corresponding issue ID).

jcohenadad commented 7 months ago

Is there a way to avoid duplicating the repo into a new one? One scenario I anticipate is that, in 6 months, we find a bug in the original repo, we fix it there, and we forget to fix it in the 2nd repo.

mathieuboudreau commented 7 months ago

> Is there a way to avoid duplicating the repo into a new one? One scenario I anticipate is that, in 6 months, we find a bug in the original repo, we fix it there, and we forget to fix it in the 2nd repo.

Yes, I think this is possible. I can either update this current repo to be NeuroLibre-compatible (and move any Colab-specific files/notebooks, though there may be a way to merge them together somehow), or simply make a specific branch and point NeuroLibre to that one. I was just using the new repo for dev so that you could mute it if you got annoyed by the frequent commits and such (while keeping an eye on this current one).

mathieuboudreau commented 7 months ago

@jcohenadad I've done some updates and converted the plots to a Plotly figure (https://shimming-toolbox.github.io/rf-shimming-7t-neurolibre/). There are a few paths that I can take from here, depending on your preference(s):

Path 1

Keep the overall structure of your Colab notebook (e.g., the text is a flow of technical info about the analysis), i.e. as it is now.

Path 2

Convert the NeuroLibre submission to a full "preprint" version of the manuscript, i.e. all the text in it but with the code for the figures embedded as hidden cells in the HTML, like we did for our T1 mapping challenge manuscript and cNeuromod manuscript.

The disadvantage of this path is that I'd have to wait until all the co-authors are done with their changes to the manuscript before adding the text/references to the NeuroLibre notebook and formatting the text (takes about a day). This path means a bit more of a delay before submitting to NeuroLibre (and to MRM), as you gave the co-authors until Friday to give their feedback.

Now, for both paths, there are a few other decisions/limitations to consider:

Structure of repo notebook(s)

You mentioned that you'd like the notebook to live in this current repo, in case you want to update it/them in the future. Note that the NeuroLibre publication is essentially an archive of the notebook/data/environment; even if you make changes here later, the HTML and accompanying Jupyter Notebook that people view on NeuroLibre will never get updated.

So that brings up my question: would you want two separate notebooks in this repo (one just for NeuroLibre that would likely not be changed later on, and one mostly for a Colab link here that you would keep updating), or a single notebook compatible with both? If the latter, just a forewarning: to make the notebook work smoothly with both the NeuroLibre submission pipeline and the NeuroLibre BinderHub (i.e., not execute the actual pipeline by default, but only download the processed data and plot it, with an optional flag to run the pipeline in BinderHub), the notebook would not be as clean as your current Colab one (i.e., most cells would have flags that change the behaviour/what is run depending on whether it's in plot-only mode, running in Binder, or running in Colab).
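
For illustration, here is a minimal sketch of what such a flag-guarded cell could look like. The `RUN_FULL_PIPELINE` environment variable and the branch contents are assumptions made up for this example, not code that currently exists in the repo:

```python
import os

# Detect the execution context. "RUN_FULL_PIPELINE" is a hypothetical flag
# name used only for this sketch.
try:
    import google.colab  # noqa: F401  (module only exists on Colab)
    IN_COLAB = True
except ImportError:
    IN_COLAB = False

RUN_FULL_PIPELINE = os.environ.get("RUN_FULL_PIPELINE", "0") == "1"

if RUN_FULL_PIPELINE:
    # Slow path: re-run the whole processing pipeline (takes hours on Binder).
    ...
elif IN_COLAB:
    # Colab-specific setup (e.g., installing SCT), then download results and plot.
    ...
else:
    # Default for NeuroLibre/Binder: only download the processed data and plot it.
    ...
```

Every compute-heavy cell would need a branch of this kind, which is the main reason the notebook would end up less clean than the current Colab one.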

Note that, overall, Colab provides more resources, so it may be nice to have a notebook compatible with it "as a backup" in case the notebook hits a limit in Binder (though I'd really like it to run completely there; @agahkarakuzu have you got an ETA/idea of how I could test that during the submission?)

If you'd like to have a quick chat tomorrow to touch base on some of these questions, let me know. TL;DR: I can either do some final touches and submit to NeuroLibre close to what it is right now (likely by end of day tomorrow), or wait for the manuscript and make it look like a full preprint of the manuscript.

jcohenadad commented 7 months ago

Thank you so much for all your efforts @mathieuboudreau, the notebook as it is now looks great. I think "path 1" makes more sense because:

But if we go with path 1, one can also wonder: what is the point of a NeuroLibre book if it's essentially a 'more cosmetic' version of the Google Colab notebook? Some arguments:

Tagging @nstikov @pbellec because this discussion is at the core of the user-who-wants-to-get-their-notebook experience.

mathieuboudreau commented 7 months ago

Thanks @jcohenadad!

Here are a few more benefits of using NeuroLibre:

There might be more; @agahkarakuzu knows the backend inside and out, so he is probably best placed to comment.

pbellec commented 7 months ago

In brief: NeuroLibre tests the submission and archives everything needed to reproduce the work as a proper academic record, for the long run. Colab does not offer any of that.

mathieuboudreau commented 7 months ago

Debugging an issue now related to a folder-permissions error during the SCT installation inside a Docker container. It worked for me every day last week, but today it started to fail. This would impact a NeuroLibre build, as they use repo2docker as well, I believe.

Opened an issue on repo2docker: https://github.com/jupyterhub/repo2docker/issues/1334

And DM'd @joshuacwnewton (for now; will post to the forum later if he deems it relevant to SCT specifically) with the following log from inside a Docker session with a blank repo2docker Docker image:


```
mathieuboudreau@b363f05df303:~$ mkdir content
mathieuboudreau@b363f05df303:~$ cd content
mathieuboudreau@b363f05df303:~/content$ cd ..
mathieuboudreau@b363f05df303:~$ git clone https://github.com/spinalcordtoolbox/spinalcordtoolbox ~/content/sct

cd ~/content/sct
yes | ./install_sct
Cloning into '/home/mathieuboudreau/content/sct'...
remote: Enumerating objects: 60992, done.
remote: Counting objects: 100% (2211/2211), done.
remote: Compressing objects: 100% (1317/1317), done.
remote: Total 60992 (delta 1462), reused 1474 (delta 880), pack-reused 58781
Receiving objects: 100% (60992/60992), 119.15 MiB | 16.17 MiB/s, done.
Resolving deltas: 100% (35117/35117), done.

*******************************
* Welcome to SCT installation *
*******************************

Checking OS type and version...

Linux b363f05df303 6.6.12-linuxkit #1 SMP Fri Jan 19 08:53:17 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Checking requirements...

OK!

SCT version ......... 6.2.dev0
Installation type ... in-place
Operating system .... linux (unknown)
Shell config ........ /home/mathieuboudreau/.bashrc

SCT will be installed here: [/home/mathieuboudreau/content/sct]

Do you agree? [y]es/[n]o: 
Skipping copy of source files (source and destination folders are the same)

Do you want to add the sct_* scripts to your PATH environment? [y]es/[n]o: 
Downloading Miniconda...

wget -nv -O /tmp/tmp.j49W08kc15/miniconda.sh https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

2024-02-07 19:15:28 URL:https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh [141613749/141613749] -> "/tmp/tmp.j49W08kc15/miniconda.sh" [1]

Installing Miniconda...

bash /tmp/tmp.j49W08kc15/miniconda.sh -p /home/mathieuboudreau/content/sct/python -b -f

PREFIX=/home/mathieuboudreau/content/sct/python
Unpacking payload ...

Installing base environment...

Downloading and Extracting Packages: ...working... done

Downloading and Extracting Packages: ...working... done
Preparing transaction: ...working... done
Executing transaction: ...working... done
installation finished.

Creating conda environment...

CondaError: Error encountered while attempting to create cache directory.
  Directory: /home/mathieuboudreau/.cache/conda/notices
  Exception: [Errno 13] Permission denied: '/home/mathieuboudreau/.cache/conda'

Installation failed!

Please find the file "/home/mathieuboudreau/content/sct/install_sct_log.txt",
then upload it as a .txt attachment in a new topic on SCT's forum:
--> http://forum.spinalcordmri.org/c/sct

mathieuboudreau@b363f05df303:~/content/sct$
```
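
For reference, a minimal diagnostic sketch, run from inside the container, to confirm that the notebook user cannot write to `~/.cache` (which is what conda trips over in the log above); this is just a check, not the fix:

```python
import os
from pathlib import Path

# Inspect the directory conda tries to create its notices cache under.
cache = Path.home() / ".cache"
target = cache if cache.exists() else cache.parent
print(f"{target}: owner uid={target.stat().st_uid}, "
      f"writable by uid {os.getuid()}: {os.access(target, os.W_OK)}")
```
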
pbellec commented 7 months ago

> In brief: NeuroLibre tests the submission and archives everything needed to reproduce the work as a proper academic record, for the long run. Colab does not offer any of that.

Trying to refine the argument. TL;DR: Google Colab notebooks break relatively fast, and NeuroLibre preprints do not.

There is a very fast decay of Google Colab notebooks. Check this example from a tutorial with cneuromod data, which I believe was set up two years ago: https://colab.research.google.com/drive/10aKI0NcSqWbwxOgvBrcv6xk-LXhKNM2u?usp=sharing#scrollTo=iL3KlwjxgOOq It won't run because it uses data hosted on Google Drive, and the data is no longer available.

As Mathieu already pointed out, this decay will also happen with dependencies. I would be very surprised if a Google Colab environment still runs after 3-5 years (or much less than that, really). A Google search for "broken dependencies google colab" gives over 100k hits (sample). Fun fact: Ubuntu does retire its channels. Controlling versions in a Docker build file does not mean the environment will still build in a few years. The only way to make a work reproducible is to save the binary environment along with the code.

I would not be surprised if NeuroLibre preprints still run in a couple of decades, provided we manage to maintain the platform. I am saying this because we archive binary container images and, given the very large number of container binaries out there, it is almost certain that archivists will develop solutions for long-term support of these images.

mathieuboudreau commented 7 months ago

> Debugging an issue now related to a folder-permissions error during the SCT installation inside a Docker container. It worked for me every day last week, but today it started to fail. This would impact a NeuroLibre build, as they use repo2docker as well, I believe.

Issue is now fixed, see this comment: https://github.com/jupyterhub/repo2docker/issues/1334#issuecomment-1933233055

jcohenadad commented 7 months ago

@mathieuboudreau just checking what is the timeline for the NeuroLibre submission? Thanks

mathieuboudreau commented 7 months ago

@jcohenadad I made a PR last week (#91) and tagged/pinged you for review. I'm waiting for these changes to be merged before submitting to NeuroLibre, as you requested in this thread that the submission be hosted in this repo and not in the separate repo I had made. So the ETA is: as soon as you approve those changes/content and they get merged into master, I'll submit (which takes just minutes).

mathieuboudreau commented 7 months ago

Just doing a last-minute sanity check that the script/setup reproduces on Colab (already found and fixed one minor bug), and that the repo2docker setup also works. Will submit to NeuroLibre ASAP after this check.

mathieuboudreau commented 7 months ago

Submitted to NeuroLibre:

[Screenshot of the NeuroLibre submission dashboard, taken 2024-02-14 at 4:07 PM]

There is no public link yet; this is from my dashboard when I'm logged in. Once @agahkarakuzu or @pbellec accepts it for review, a GitHub issue will be opened on the NeuroLibre GitHub and I'll post the link here.

jcohenadad commented 7 months ago

Amazing! Is it OK to submit to MRM at this point?

mathieuboudreau commented 7 months ago

If we can wait a short amount of time so that they can trigger the start of the process, then I believe it would generate a DOI that we could add to the manuscript (even though the NeuroLibre review process hasn't been completed).

@agahkarakuzu or @pbellec, could you do this ASAP?

agahkarakuzu commented 7 months ago

@mathieuboudreau the DOI link (after publication) will be:

https://doi.org/10.55458/neurolibre.00025

Reproducible preprint will be served at:

https://preprint.neurolibre.org/10.55458/neurolibre.00025

I've seen your notes on the submission form regarding the two options (2-hour run vs. quick run); we'll test and see how it goes.

@mathieuboudreau if you give me write access to the repo, I can quickly push fixes needed, or I can send PRs, whichever is more convenient for you.

mathieuboudreau commented 7 months ago

> @mathieuboudreau if you give me write access to the repo, I can quickly push fixes needed, or I can send PRs, whichever is more convenient for you.

Done.

mathieuboudreau commented 7 months ago

> Amazing! Is it OK to submit to MRM at this point?

@jcohenadad the submission has passed the pre-review stage (https://github.com/neurolibre/neurolibre-reviews/issues/24) and is currently under review (https://github.com/neurolibre/neurolibre-reviews/issues/25).

I've updated the Data Availability Statement with the details Agah shared (DOI and link); they aren't active yet, but I've indicated in the manuscript that it has been submitted and is under review, along with those links.

mathieuboudreau commented 7 months ago

@jcohenadad now that you've decided to keep the old MPL-based figure in the manuscript instead of the Plotly one, would you like me to re-integrate the MPL code into the notebook that generates that image? The lines of code that were removed are in https://github.com/shimming-toolbox/rf-shimming-7t/commit/0d0516096365e6c2f97ae8afd82d11cbed301648, as your comment https://github.com/shimming-toolbox/rf-shimming-7t/pull/91#pullrequestreview-1878413109 suggested switching to the Plotly one for the manuscript (and thus the MPL code wasn't needed anywhere anymore).

jcohenadad commented 7 months ago

I think there has been some misunderstanding: https://github.com/shimming-toolbox/rf-shimming-7t/commit/0d0516096365e6c2f97ae8afd82d11cbed301648#commitcomment-138731916