Generating a report for DNA and RNA samples from different runs does not work properly when the DNA seq run is older than the RNA seq run

InPreD / PRONTO

rePort geneRator fOr iNpred Tumor bOards

GNU General Public License v3.0


Generating a report for DNA and RNA samples from different runs does not work properly when the DNA seq run is older than the RNA seq run #29

Status: Closed (tinavisnovska closed this issue 5 months ago)

tinavisnovska commented 7 months ago

I have a DNA sample from an older sequencing run and an RNA sample from a more recent sequencing run. Both samples belong to one patient, and I want to generate a single report for the two samples.

When TSOPPI postprocessing is performed, all postprocessing data related to the DNA sample gets stored in the postprocessing folder of the sequencing run in which the RNA sample was sequenced.

PRONTO cannot locate the small variant table of the DNA sample in runID_DNA because the table is not stored in runID_DNA (where PRONTO looks for it) but in runID_RNA.

It seems to me that the issue is related to the code around line 1030, where the value of data_file_small_variant_table is defined.
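To illustrate the kind of lookup that would avoid this, here is a minimal sketch and not PRONTO's actual code. The function name `find_small_variant_table`, the directory layout (`base_dir/run_id/sample_id/`), and the file suffix are all assumptions made up for the example; the real definition of data_file_small_variant_table may look quite different. The idea is simply to try runID_DNA first and fall back to runID_RNA.

```python
from pathlib import Path
from typing import Iterable, Optional


def find_small_variant_table(
    base_dir: str,
    sample_id: str,
    candidate_run_ids: Iterable[str],
    table_suffix: str = "_small_variant_table.tsv",  # hypothetical file suffix
) -> Optional[Path]:
    """Return the first existing small variant table among the candidate runs.

    The layout base_dir/<run_id>/<sample_id>/<sample_id><suffix> is an
    assumption for illustration, not the real TSOPPI/PRONTO layout.
    """
    for run_id in candidate_run_ids:
        candidate = Path(base_dir) / run_id / sample_id / f"{sample_id}{table_suffix}"
        if candidate.is_file():
            return candidate
    # Nothing found in any candidate run; the caller decides how to report it.
    return None
```

A caller would pass the run IDs in order of preference, for example `["runID_DNA", "runID_RNA"]`, so the sequencing run of the DNA sample is still checked first.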

tinavisnovska commented 7 months ago

A temporary workaround is to provide runID_RNA as the runID_DNA value for the DNA sample in the PRONTO meta file.

xiaoliz0 commented 7 months ago

Yes, PRONTO finds the RNA sample based on the results of post-processing. You could also fill it into the meta file manually.

tinavisnovska commented 6 months ago

Hi again, there is no issue with the RNA sample in this case. However, for the DNA sample, one needs to provide the ID of the sequencing run in which the RNA sample was sequenced (not the one in which the DNA sample was sequenced). This is rather confusing.

The issue here is, I believe, that when I run TSOPPI, I can provide up to three different sequencing run IDs for the three samples: tumor DNA, normal DNA and tumor RNA. In PRONTO, however, for each sample I should provide the ID of the run in which the TSOPPI results are stored, and that run ID can differ from the run in which the particular sample was sequenced. I think it would help to have this explained better in the header of the metadata file.

xiaoliz0 commented 6 months ago

Emm, I see your point. PRONTO reads the RNA information (the RNA sample run directory) directly from the sample_list.tsv file in the TSOPPI results folder of the DNA sample, and it gets the RNA images based on the run directory written there. The RNA information in the meta file will not affect the reports. If you want to use a different RNA sample path than the one recorded in the TSOPPI results, PRONTO would need some additional functionality. You could bring this topic to the next bioinformatics meeting, I think. :)
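As a rough illustration of the lookup described above (a sketch only, not PRONTO's actual code): the function name `read_rna_run_dir` and the column name `rna_run_dir` are hypothetical, since the real header of sample_list.tsv is not shown in this thread. The sketch assumes the file is tab-separated with a header row.

```python
import csv
from typing import Optional


def read_rna_run_dir(
    sample_list_path: str,
    rna_column: str = "rna_run_dir",  # hypothetical column name
) -> Optional[str]:
    """Return the first non-empty RNA run directory listed in sample_list.tsv.

    Assumes a tab-separated file with a header row; the actual TSOPPI
    format may differ.
    """
    with open(sample_list_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        for row in reader:
            # row.get returns None for a missing column or a short row.
            value = (row.get(rna_column) or "").strip()
            if value:
                return value
    return None
```

Because the value comes from the TSOPPI results rather than from the meta file, changing the RNA run ID in the meta file would not change what a lookup like this returns.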

tinavisnovska commented 6 months ago

Hm, I find it a bit difficult to keep a working mental model of where PRONTO takes which information from. It would help to have it visualized somehow, or at least written down. It is especially tricky with the implicit input files (for example sample_list.tsv or variant_summary.tsv), which are not listed as inputs but are used anyway. Yes, we can definitely discuss it more at one of the bioinformatics meetings!