DiltheyLab / HLA-LA

Fast HLA type inference from whole-genome data
GNU General Public License v3.0

Cram files - [main_samview] fail to read the header from "-". #63

Closed by juliandwillett 2 years ago

juliandwillett commented 2 years ago

I encountered an error when processing CRAM files. It was thrown at the step that extracts unmapped reads. After some help on Biostars, I found a solution:

  1. Duplicate the HLA-LA.pl file. I named the copy HLA-LA_CRAM.pl.
  2. In the copy, replace line 415 with the following:

     my $extraction_command_unmapped = qq($samtools_bin view -H $BAM > $working_dir_thisSample/header.file ; $samtools_bin view -\@ $threads_minus_1 $view_T_switch $BAM '*' | awk '{if (\$3 == "*") print \$0}' > $working_dir_thisSample/filtered.sam ; cat $working_dir_thisSample/header.file $working_dir_thisSample/filtered.sam | $samtools_bin view -bo $target_extraction_unmapped -);

The gist of the issue was that the header information was being removed by awk, so the final samtools call received a stream with no header. I dealt with this by first saving the header to a file, running the awk command, adding the header back in front of the filtered reads, and then continuing with the pipeline. This should not affect the accuracy of the program.
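
To illustrate the header problem without needing samtools or a real CRAM file, here is a minimal sketch on a toy SAM-like text file. The file contents, the `workdir` variable, and the file names other than header.file and filtered.sam are made up for the example and are not part of HLA-LA. It shows that the awk filter on column 3 drops the `@` header lines, that concatenating a saved header restores them, and that (as an untested alternative for the real pipeline) a single awk pass can keep header lines itself.

```shell
# Toy input: two header lines, one mapped read, one unmapped read.
# (Illustrative data only; not produced by HLA-LA or samtools.)
workdir=$(mktemp -d)
{
  printf '@HD\tVN:1.6\tSO:coordinate\n'
  printf '@SQ\tSN:chr6\tLN:1000\n'
  printf 'read1\t0\tchr6\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n'
  printf 'read2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n'
} > "$workdir/toy.sam"

# The filter alone keeps only lines whose third column is "*".
# Header lines do not match, so they are lost.
awk '$3 == "*"' "$workdir/toy.sam" > "$workdir/filtered.sam"

# Approach from the post: save the header separately, then put it back
# in front of the filtered reads.
grep '^@' "$workdir/toy.sam" > "$workdir/header.file"
cat "$workdir/header.file" "$workdir/filtered.sam" > "$workdir/with_header.sam"

# Alternative sketch: let awk pass header lines through in the same pass.
awk '/^@/ || $3 == "*"' "$workdir/toy.sam" > "$workdir/one_pass.sam"

# Compare the two results; cmp prints nothing when the files are identical.
cmp "$workdir/with_header.sam" "$workdir/one_pass.sam"
```

In the real command, the one-pass variant would only help if the samtools call feeding awk actually emits the header (for example via `samtools view -h`), which I have not tested inside HLA-LA.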