NationalGenomicsInfrastructure / piper

A genomics pipeline built on top of the GATK Queue framework

fc_id naming convention and stckh2UUSNPSEQ #26

Closed vezzi closed 9 years ago

vezzi commented 9 years ago

Hej Johan, we are having some trouble deciding on the fc_id.

Up to now we have shortened the original fc name 140528_D00415_0049_BC423WACXX to 140528_BC423WACXX.

I think this is exactly what piper assumes, as it uses the fc_name BC423WACXX (in addition to the sample name) to give the bam files unique names.

In order to align our directory structure with the db, we would like to use the entire fc name 140528_D00415_0049_BC423WACXX.

I noticed that stckh2UUSNPSEQ is affected by this, as report.tsv says that the fc_id is D00415:

#SampleName     Lane    ReadLibrary     FlowcellId
P1142_101       1       A       D00415

Do you have a strong opinion in favour of using the reduced fc_name (140528_BC423WACXX)? Can you change stckh2UUSNPSEQ so that it selects the correct field? I assume that you currently split using "_" as the separator and use the second value. Can you do the same but use the last value, so that both the long and the short name will work?

johandahlberg commented 9 years ago

You're right about how I've done it. I'll push a fix this afternoon to take the last value instead of the second one, as you suggest.
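
For reference, the change discussed above can be sketched as follows. This is only an illustration in Python, not the actual stckh2UUSNPSEQ code, and the function names `flowcell_id_second_field` and `flowcell_id_last_field` are hypothetical. It assumes the flowcell name is split on "_" as described in the thread.

```python
def flowcell_id_second_field(fc_name: str) -> str:
    """Behaviour assumed in the thread: split on "_" and take the second value."""
    return fc_name.split("_")[1]


def flowcell_id_last_field(fc_name: str) -> str:
    """Proposed behaviour: split on "_" and take the last value."""
    return fc_name.split("_")[-1]


long_name = "140528_D00415_0049_BC423WACXX"
short_name = "140528_BC423WACXX"

# Taking the second value only works for the short name.
print(flowcell_id_second_field(long_name))   # D00415
print(flowcell_id_second_field(short_name))  # BC423WACXX

# Taking the last value works for both forms.
print(flowcell_id_last_field(long_name))     # BC423WACXX
print(flowcell_id_last_field(short_name))    # BC423WACXX
```

Taking the last field gives the same fc_id for both the long and the short name, as long as the flowcell barcode stays the final "_"-separated component.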