Open sellth opened 1 year ago
Another thing (I almost forgot):
This only seems to address the variable names; however, many templates also use Sample Name as the first column in the s-file instead of Source Name, so maybe this should (or needs to) be changed as well? Otherwise the variable renaming makes little sense.
If we do change this, I'm not sure what the effects will be on pipelines that might depend on these templates.
Thanks Nicolai for looking into this.
- for one thing, the stem cell core templates actually ask for sample names, since we use the cell line names as source
That was indeed an oversight in the stem_cell_core_sc template which is fixed now. I left _bulk unchanged because of this.
- From what I've seen so far many templates use the same name for sample & source. […] For most experimental people (or just people not accustomed to ISA) sample name is a much more intuitive description than source name in these cases.
Most templates derive their Source Names from the Sample Names, but I would agree with Mikko that this is a bit confusing in the context of ISA-tabs and also experimentally. I would expect Sample Names to be derived from the Source Names plus (optionally) a suffix. That is how I defined it for the MC template: there are source_names and sample_suffix in the cookiecutter.json.
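To make the derivation I have in mind concrete, here is a minimal sketch in Python. The helper name `derive_sample_names` is hypothetical, and it assumes that source_names arrives as a comma-separated string and sample_suffix as a plain string; the actual template may handle these values differently.

```python
def derive_sample_names(source_names, sample_suffix=""):
    """Return one sample name per source name, with an optional suffix appended.

    Assumption: source_names is a comma-separated string (as a
    cookiecutter.json value might be); this format is not confirmed.
    """
    # Split on commas and drop empty entries and surrounding whitespace.
    sources = [name.strip() for name in source_names.split(",") if name.strip()]
    # Each sample name is its source name plus the (possibly empty) suffix.
    return [f"{source}{sample_suffix}" for source in sources]
```

With an empty suffix the sample names are identical to the source names, which matches what many templates effectively do today.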
This only seems to address the variable names; however, many templates also use Sample Name as the first column in the s-file instead of Source Name, so maybe this should (or needs to) be changed as well? Otherwise the variable renaming makes little sense.
Not sure what you mean by this. s_ files need to start with a Source Name column to be standard compliant and all do so right now.
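For anyone who wants to double-check a rendered template, a rough sketch of such a check follows. The function name `starts_with_source_name` is hypothetical and not part of cubi-tk; it assumes the s_ file is tab-separated with a single header row.

```python
import csv
import io


def starts_with_source_name(s_file_text):
    """Return True if the first header column of an s_ file is 'Source Name'.

    Assumption: the file is tab-separated with one header row.
    """
    reader = csv.reader(io.StringIO(s_file_text), delimiter="\t")
    # An empty file yields no rows, so fall back to an empty header.
    header = next(reader, [])
    return bool(header) and header[0].strip() == "Source Name"
```

Passing the file contents as text keeps the check independent of where the rendered template lives on disk.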
Most templates derive their Source Names from the Sample Names, but I would agree with Mikko that this is a bit confusing in the context of ISA-tabs and also experimentally. I would expect Sample Names to be derived from the Source Names plus (optionally) a suffix. That is how I defined it for the MC template: there are source_names and sample_suffix in the cookiecutter.json.
The templates might indeed do this, but I would argue that most users generally do not, since they only come up with source names when they start entering things into SODAR (they will always have some sort of sample name ready). Maybe the more important question to answer is: who will use these templates, or rather, who do we want to use them? For larger projects (with inevitably closer CUBI collaboration), someone will probably figure out a good way to organise and derive sample and source names. But for smaller projects, where (maybe one day?) someone can just create and fill in the sample sheet from within SODAR, this is not the case; these people will likely come with a list of samples and names, but not source names.
Not sure what you mean by this. s_ files need to start with a Source Name column to be standard compliant and all do so right now.
Ah, you're right. I must have confused some (older?) things here, or maybe I just remembered the start of some a-files ...
As this is not really urgent, let's not do anything hastily and talk once I'm back in Berlin.
fixes: bihealth/cubi-tk#106