BushmanLab / intSiteCaller

GNU General Public License v3.0

Register metadata with Bushman Registry #49

Open aubreybailey opened 9 years ago

aubreybailey commented 9 years ago

Please register samples with the lab registry when a run completes. I realize this isn't a generic feature of this script, but the end of the run seems like a convenient place to call it.

Run /home/aubreybailey/parse_samplesheet.py /media/THING1/Illumina/1507..../Data/Intensities/BaseCalls/Undetermined_S0_L001_R1_001.fastq.gz to get the suggested registration line, like:

register_run.py --file=$PWD/Undetermined_S0_L001_R1_001.fastq.gz

then:

register_samples.py --insprid -r ### -s sampleInfo.tsv
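For illustration only, here is a minimal Python sketch of how a wrapper might assemble those two registration commands at the end of a run. It assumes `register_run.py` and `register_samples.py` are on the PATH and accept exactly the flags quoted above (copied from this comment, not checked against the scripts). The function name `build_registration_commands` and the example path and run id are hypothetical. Nothing is executed; the commands are only built and printed.

```python
import shlex


def build_registration_commands(fastq, run_id, sample_sheet="sampleInfo.tsv"):
    """Return the two registry commands as argv lists; nothing is executed.

    The flags are copied verbatim from the issue comment and are an
    assumption about the registry scripts, not a verified interface.
    """
    register_run = ["register_run.py", f"--file={fastq}"]
    register_samples = [
        "register_samples.py",
        "--insprid",
        "-r", str(run_id),        # run id, shown as ### in the comment
        "-s", str(sample_sheet),  # sample sheet described in the README
    ]
    return [register_run, register_samples]


if __name__ == "__main__":
    # Hypothetical path and run id, for demonstration only.
    example = "/data/run/Undetermined_S0_L001_R1_001.fastq.gz"
    for argv in build_registration_commands(example, 123):
        print(shlex.join(argv))
```

Returning argv lists rather than a shell string keeps quoting out of the picture; a caller could hand each list to `subprocess.run` once the run id is known.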

anatolydryga commented 9 years ago

Sounds good, but this is not the caller's concern; maybe it belongs to the uploader(?). The registry is not used in the pipeline for now. Should the dependency be introduced?

kylebittinger commented 9 years ago

Whether or not it's in the pipeline, you and Yinghua should incorporate it into your SOP to register the sample metadata. For keeping track of samples long-term, it has saved us many times.

If incorporated into the pipeline, I think it should be included in a part that you expect to run for every dataset. If you expect to process but not upload some runs, then it seems to me that it will do more harm than good to put it in the uploader step.

aubreybailey commented 9 years ago

This clearly belongs as something separate, not as a dependent part, especially as we expect to publish this separately from the registry. The sampleInfo.tsv described in the README is not part of the default Illumina output, so whichever step creates this file should naturally precede the registry scripts.
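One way to picture that separation is a small opt-in guard that a wrapper could consult before calling the registry scripts. This is only a sketch under stated assumptions: the function name `should_register` and the `enabled` switch are hypothetical, and the only fact taken from this thread is that sampleInfo.tsv must already exist before registration.

```python
from pathlib import Path


def should_register(run_dir, enabled=False, sample_sheet="sampleInfo.tsv"):
    """Decide whether the optional registry step can run for this dataset.

    Registration stays off unless explicitly enabled, so the pipeline has
    no hard dependency on the registry. It also requires that the sample
    sheet was already created by an earlier step.
    """
    if not enabled:
        return False
    return (Path(run_dir) / sample_sheet).is_file()
```

With this shape the caller works unchanged when the registry is absent, and registration is skipped rather than failing when sampleInfo.tsv has not been written yet.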

(Sent by email in reply to Kyle Bittinger's comment of Thu, Aug 13, 2015, 3:32 PM.)