khanlab / snakebids

Snakemake + BIDS
https://snakebids.readthedocs.io
MIT License

add an easy way for user to add a README and dataset_description to their output #221

Open Remi-Gau opened 1 year ago

Remi-Gau commented 1 year ago

Since dataset_description.json and the README are required by the specification (https://bids-specification.readthedocs.io/en/latest/modality-agnostic-files.html), I think there should be a way to add them to the output dataset automatically, possibly by using some of the config to populate them.

Another nice addition would be an easy way to generate a methods section and a list of references à la fMRIPrep, possibly by relying on Jinja templates or similar for more flexibility.

Remi-Gau commented 1 year ago

related to #154

pvandyken commented 1 year ago

#154 stalled because it wasn't clear what the best way to approach this was. Thinking about it now, my preferred approach would be to make dataset_description.json files using a rule within the snakemake folder:

# something like this

rule generate_dataset_description:
    input: inputs['dataset'].path
    output: output / "dataset_description.json"
    run:
        # modify the description file accordingly
        pass

Currently, my inputs['dataset'].path syntax wouldn't work because of #216, but once that's resolved, you'd achieve it by defining the dataset component within pybids_inputs. Alternatively, once #223 is implemented, you could access the dataset_description via the pybids layout.
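
As a rough sketch of what the body of such a `run:` block could do, assuming the input dataset ships its own dataset_description.json: the function below copies the input's `Name` and `BIDSVersion` and marks the output as a derivative. The function name `write_derivative_description`, its parameters, and the fallback `BIDSVersion` value are hypothetical and not part of snakebids; `DatasetType` and `GeneratedBy` are field names from the BIDS specification.

```python
import json
from pathlib import Path


def write_derivative_description(source_dir, out_dir, pipeline_name, pipeline_version):
    """Write a derivative dataset_description.json based on the input dataset's.

    Hypothetical helper: not part of snakebids or snakemake.
    """
    source_file = Path(source_dir) / "dataset_description.json"
    # Fall back to an empty description if the input dataset has none.
    source = json.loads(source_file.read_text()) if source_file.exists() else {}

    description = {
        "Name": f"{pipeline_name} derivatives of {source.get('Name', 'unnamed dataset')}",
        # "1.8.0" is an assumed fallback, used only if the input has no BIDSVersion.
        "BIDSVersion": source.get("BIDSVersion", "1.8.0"),
        "DatasetType": "derivative",
        "GeneratedBy": [{"Name": pipeline_name, "Version": pipeline_version}],
    }

    out_file = Path(out_dir) / "dataset_description.json"
    out_file.parent.mkdir(parents=True, exist_ok=True)
    out_file.write_text(json.dumps(description, indent=2) + "\n")
    return description
```

Inside a rule, this would presumably be called with the input dataset path and the output directory, but how those are exposed depends on how #216 is resolved.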

I like the idea of templating, but I think it's ultimately beyond the scope of snakebids. I'd rather spend time trying to make snakebids play well with 3rd party tools that achieve these tasks.

@tkkuehn, not sure if you have thoughts on this?

Remi-Gau commented 1 year ago

So more something that should go in the cookiecutter, like a rule that people can opt into?

akhanf commented 1 year ago

I haven't tried it out, but there is now built-in jinja templating with snakemake:

https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#template-rendering-integration

So the rule-based option would be pretty trivial. I agree having this in the cookiecutter and/or example-snakebids app would make sense.

kaitj commented 1 year ago

Have a very basic implementation for this now using cookiecutter and the template integration. Some of the implementation details should be discussed. Briefly:

  1. It uses cookiecutter to set up the snakebids app with opt in for creating dataset metadata
  2. If the user opts in, it creates the necessary rules and jinja templates for a barebones README.md and dataset_description.json file, where the file context can be updated via the rule params. Where it can, it will pull data from the defined cookiecutter variables.
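
To illustrate the second point, here is a minimal sketch of rendering a barebones README from a small context. It uses the standard library's `string.Template` as a stand-in for a Jinja template so that it runs without extra dependencies; the template text, the `render_readme` helper, and the context keys are all hypothetical and not taken from the actual implementation.

```python
from string import Template

# Stand-in for a Jinja template; string.Template keeps the sketch stdlib-only.
README_TEMPLATE = Template(
    "# $name\n"
    "\n"
    "$description\n"
    "\n"
    "Generated with $app_name version $app_version.\n"
)


def render_readme(context):
    """Fill the README template; raises KeyError if a variable is missing."""
    return README_TEMPLATE.substitute(context)


# Hypothetical values, standing in for cookiecutter variables or rule params.
readme = render_readme(
    {
        "name": "Example derivatives",
        "description": "Outputs of an example snakebids app.",
        "app_name": "example_app",
        "app_version": "0.1.0",
    }
)
```

Using `substitute` rather than `safe_substitute` means a missing variable fails loudly instead of leaving a placeholder in the README.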

kaitj commented 1 year ago

During the dev lunch, we discussed moving this away from the cookiecutter / snakemake integration and instead generating these files in pure Python, as a Snakebids plugin (potentially run as a post-workflow hook).
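
A pure-Python plugin along these lines might look roughly like the sketch below. It assumes a plugin is simply a callable that receives the app object, and that the app exposes a dict-like `config` with an `"output_dir"` key; both are assumptions made for illustration, and the class name `WriteDatasetMetadata` is hypothetical. How and when it would be run as a post-workflow hook is left open, as in the discussion.

```python
import json
from pathlib import Path


class WriteDatasetMetadata:
    """Hypothetical plugin: a callable that takes the app and writes metadata."""

    def __init__(self, name, version):
        self.name = name
        self.version = version

    def __call__(self, app):
        # Assumes the app exposes a dict-like `config` with an "output_dir" key.
        out_dir = Path(app.config["output_dir"])
        out_dir.mkdir(parents=True, exist_ok=True)

        description = {
            "Name": self.name,
            "BIDSVersion": "1.8.0",  # assumed version, adjust as needed
            "DatasetType": "derivative",
            "GeneratedBy": [{"Name": self.name, "Version": self.version}],
        }
        (out_dir / "dataset_description.json").write_text(
            json.dumps(description, indent=2) + "\n"
        )

        readme = out_dir / "README.md"
        # Do not overwrite a README the user has already written.
        if not readme.exists():
            readme.write_text(f"# {self.name}\n")
        return app
```

Because the plugin only needs an object with a `config` attribute, it can be tried out with a simple stand-in object before wiring it into a real app.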