caravagnalab / rRACES

R wrapper for the RACES package
GNU General Public License v3.0

SAM validation error for paired reads #95

Closed giorgiagandolfi closed 8 months ago

giorgiagandolfi commented 8 months ago

I tried to simulate the sequencing of paired-end reads in the following way:

seq_results_normal <- simulate_normal_seq(phylo_forest, coverage = 20, write_SAM = TRUE,
                                          output_dir = "rRACES_Normal_20X_Paired", insert_size = 350)
seq_results_tumor <- simulate_seq(phylo_forest, coverage = 40, with_normal_sample = FALSE,
                                  insert_size = 350, purity = 1, write_SAM = TRUE,
                                  output_dir = "rRACES_Tumor_40X_1_Paired")

I set insert_size = 350 based on the information on the Illumina website. To check the correctness of the SAM files, I ran Picard ValidateSamFile and got the following error for all the reads:

ERROR::INVALID_REFERENCE_INDEX:Record 54, Read name r00000000026/2, Mate Reference sequence not found in sequence dictionary.

A similar warning was raised when I tried to convert the SAM file to BAM:

samtools view -bS rRACES_Normal_20X_Paired/chr_22.sam > rRACES_Normal_20X_Paired/chr_22.bam
[W::sam_parse1] unrecognized mate reference name "r00000053053/2"; treated as unmapped

I also simulated single-end reads, without specifying insert_size, and in that case the SAM validation did not report any errors.
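
The samtools warning above suggests that the mate reference column (RNEXT, the seventh SAM field) holds a read name such as "r00000053053/2" instead of a reference name. According to the SAM specification, RNEXT must be "*", "=", or a sequence name declared in an @SQ header line. Below is a minimal Python sketch of a check for this; the function name find_invalid_rnext and the example records are hypothetical and only illustrate the kind of record that would trigger the error, they are not taken from the actual rRACES output.

```python
def find_invalid_rnext(sam_lines):
    """Return (line_number, read_name, rnext) for every alignment record
    whose RNEXT field is neither "*", "=", nor a name declared in @SQ."""
    references = set()
    invalid = []
    for line_number, line in enumerate(sam_lines, start=1):
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            # Header line: collect reference names from @SQ SN: tags.
            if line.startswith("@SQ"):
                for field in line.split("\t")[1:]:
                    if field.startswith("SN:"):
                        references.add(field[3:])
            continue
        fields = line.split("\t")
        if len(fields) < 11:
            # Not a complete alignment record (11 mandatory fields); skip it.
            continue
        rnext = fields[6]  # 7th column: reference name of the mate
        if rnext not in ("*", "=") and rnext not in references:
            invalid.append((line_number, fields[0], rnext))
    return invalid


# Hypothetical records for illustration only (not real rRACES output).
example = [
    "@HD\tVN:1.6\tSO:unknown",
    "@SQ\tSN:22\tLN:50818468",
    # Valid: "=" means the mate maps to the same reference as this read.
    "r1\t99\t22\t100\t60\t5M\t=\t450\t355\tACGTA\tIIIII",
    # Invalid: RNEXT holds a read name rather than a reference name.
    "r2\t99\t22\t200\t60\t5M\tr2/2\t550\t355\tACGTA\tIIIII",
]
problems = find_invalid_rnext(example)
```

To run the same check on one of the simulated files, the function can be given an open file handle, for example open("rRACES_Normal_20X_Paired/chr_22.sam"), since it only iterates over lines.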

albertocasagrande commented 8 months ago

Closed by commit b86e79eeee975a276707cc6e4a20caf6c2758f5c.