marcos-diazg / musica

MuSiCa - Mutational Signatures in Cancer
https://www.clinicbarcelona.org/en/idibaps/research-areas/liver-digestive-system-and-metabolism/genetic-predisposition-to-gastrointestinal-cancer/tools
MIT License

Mandatory column names #32

Closed zhiiiyang closed 6 years ago

zhiiiyang commented 6 years ago

Hi, I understand that we have to include four mandatory columns: CHROM, POS, REF, and ALT. I wonder where to put the sample names. I tried a couple of times and could not get it to work. Thank you.

marcos-diazg commented 6 years ago

Hello!

Thank you so much for using our application! Regarding your question, you can add any columns you like after the CHROM, POS, REF and ALT columns. However, a single TSV or Excel file cannot contain the mutations of more than one sample. As specified in the help for input files, multi-sample uploading is allowed, but with one sample per file for TSV and Excel formats. To use a single multi-sample file, you must use MAF format, following the TCGA instructions.
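
For anyone starting from one combined table, here is a minimal sketch of splitting it into one TSV per sample before uploading. It assumes the combined table has a sample-name column called `SAMPLE`; that column name and the `split_by_sample` helper are hypothetical and not part of MuSiCa. Only the four mandatory columns are kept in each output table.

```python
import csv
import io
from collections import defaultdict

# Mandatory column names, as stated in this thread.
MANDATORY = ["CHROM", "POS", "REF", "ALT"]


def split_by_sample(tsv_text, sample_column="SAMPLE"):
    """Split a combined TSV into one TSV string per sample.

    `sample_column` is a hypothetical column holding the sample name;
    adjust it to whatever your own table uses.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    fieldnames = reader.fieldnames or []
    missing = [c for c in MANDATORY + [sample_column] if c not in fieldnames]
    if missing:
        raise ValueError(f"missing columns: {missing}")

    # Group the rows by sample name, preserving input order within a sample.
    rows_by_sample = defaultdict(list)
    for row in reader:
        rows_by_sample[row[sample_column]].append(row)

    # Write one table per sample, keeping only the mandatory columns.
    tables = {}
    for sample, rows in rows_by_sample.items():
        buffer = io.StringIO()
        writer = csv.DictWriter(
            buffer,
            fieldnames=MANDATORY,
            delimiter="\t",
            extrasaction="ignore",
            lineterminator="\n",
        )
        writer.writeheader()
        writer.writerows(rows)
        tables[sample] = buffer.getvalue()
    return tables


combined = (
    "CHROM\tPOS\tREF\tALT\tSAMPLE\n"
    "1\t100\tC\tT\tS1\n"
    "2\t200\tG\tA\tS2\n"
    "1\t300\tT\tC\tS1\n"
)
tables = split_by_sample(combined)
```

Each value in `tables` can then be saved to its own file (for example `S1.tsv`, `S2.tsv`) and the files uploaded together.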

zhiiiyang commented 6 years ago

Cool. That makes sense. I will give it a try and keep you posted. Thank you so much!

zhiiiyang commented 6 years ago

It works when uploading only one sample per file. I have another question about the algorithm running behind it. When the non-negative least-squares optimization is used to find the signature weights/contributions, do you run the optimization on each sample separately or in a combined framework when multiple samples are present?

marcos-diazg commented 6 years ago

Hello again!

If you use the CHROM, POS, REF and ALT columns, in either TSV or Excel format, the application allows multi-sample uploading (one file per sample) to evaluate more than one sample at a time. However, the least-squares optimization is run for each sample independently.

Thanks for your feedback!
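
To illustrate what "for each sample independently" means, here is a rough sketch, not MuSiCa's actual code. It assumes a signature matrix with one row per mutation channel and one column per signature, and a count vector per sample. The solver is a plain coordinate-descent stand-in written for this example; in practice a library routine such as `scipy.optimize.nnls` would be the usual choice. The toy matrix and the function names are hypothetical.

```python
def nnls_coordinate_descent(A, b, n_iter=500):
    """Minimise ||A x - b||^2 subject to x >= 0 by cyclic coordinate descent.

    A is a list of rows (channels x signatures), b a list of counts.
    This is an illustrative solver, not the one used by the application.
    """
    n_rows, n_cols = len(A), len(A[0])
    x = [0.0] * n_cols
    residual = [float(v) for v in b]  # residual = b - A x, and x starts at 0
    col_sq = [sum(A[i][j] ** 2 for i in range(n_rows)) for j in range(n_cols)]
    for _ in range(n_iter):
        for j in range(n_cols):
            if col_sq[j] == 0.0:
                continue
            # Exact minimiser along coordinate j, clipped at zero.
            grad = sum(A[i][j] * residual[i] for i in range(n_rows))
            new_xj = max(0.0, x[j] + grad / col_sq[j])
            delta = new_xj - x[j]
            if delta != 0.0:
                for i in range(n_rows):
                    residual[i] -= delta * A[i][j]
                x[j] = new_xj
    return x


def fit_each_sample(signatures, counts_by_sample):
    """Solve one separate NNLS problem per sample; no information is shared."""
    return {
        sample: nnls_coordinate_descent(signatures, counts)
        for sample, counts in counts_by_sample.items()
    }


# Toy example: 3 mutation channels, 2 signatures (hypothetical values).
signatures = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
counts = {"S1": [2.0, 3.0, 5.0], "S2": [1.0, 0.0, 0.0]}
weights = fit_each_sample(signatures, counts)
```

Because each sample is a separate problem, the weights for `S1` are the same whether it is fitted alone or alongside `S2`.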

zhiiiyang commented 6 years ago

I think that answers all my questions. Thank you so much again!