imgag / megSAP

a Medical Genetics Sequence Analysis Pipeline
GNU General Public License v3.0

Incorrect error when trying to analyze a file without Read Group #304

Closed caspargross closed 2 weeks ago

caspargross commented 2 weeks ago

The bug occurs in the function that extracts read groups from a BAM file. This function is used in the long-read analysis before minimap2 is run.

https://github.com/imgag/megSAP/blob/358d1de90507f173d1a4b262b3db4dcf0cf9625e/src/Common/genomics.php#L2216-L2242

I tried to analyze a file without a read group. Instead of the expected error at line 2239 ("No Readgroup description found"), I got the following error in the log file:

2024-10-15T16:10:03.3521    kbQz    Parameter: out                  = analysis/megsap/HG004_BER_01.HAC.GRCh38/HG004-BER-01-HAC-GRCh38.bam
2024-10-15T16:10:03.4037    kbQz    ERROR: 'Error while executing command: '/mnt/storage2/megSAP/tools/samtools-1.20/samtools view -H analysis/megsap/HG004_BER_01.HAC.GRCh38/HG004-BER-01-HAC-GRCh38_01.mod.unmapped.bam | egrep '^@RG' '
CODE: 1
STDOUT: 
STDERR: 
' in /mnt/storage2/megSAP/pipeline/src/Common/functions.php:73.
2024-10-15T16:10:03.4068    kbQz    END mapping_minimap
2024-10-15T16:10:03.4137    kbQz    Execution time of 'mapping_minimap': 0.3470s
2024-10-15T16:10:03.4244    jUF4    Stderr of 'php /mnt/storage2/megSAP/pipeline//src/NGS/mapping_minimap.php':
2024-10-15T16:10:03.4244    jUF4        ERROR: 'Error while executing command: '/mnt/storage2/megSAP/tools/samtools-1.20/samtools view -H analysis/megsap/HG004_BER_01.HAC.GRCh38/HG004-BER-01-HAC-GRCh38_01.mod.unmapped.bam | egrep '^@RG' '
2024-10-15T16:10:03.4244    jUF4        CODE: 1
2024-10-15T16:10:03.4244    jUF4        STDOUT: 
2024-10-15T16:10:03.4244    jUF4        STDERR: 
2024-10-15T16:10:03.4244    jUF4        ' in /mnt/storage2/megSAP/pipeline/src/Common/functions.php:73.
2024-10-15T16:10:03.4283    jUF4    ERROR: 'Call of external tool 'php /mnt/storage2/megSAP/pipeline//src/NGS/mapping_minimap.php' returned error code '255'.' in /mnt/storage2/megSAP/pipeline/src/Common/ToolBase.php:1138.
2024-10-15T16:10:03.4826    jUF4    END analyze_longread
2024-10-15T16:10:03.4865    jUF4    Execution time of 'analyze_longread': 18m 1s
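
The log shows exit code 1 with empty STDOUT and STDERR, which is what grep returns when no line matches. A likely explanation is that the command wrapper treats any non-zero status as a failure before the "no read group" check is reached. The following is a minimal sketch of that behavior and of one tolerant way to handle it; the functions `header_without_rg`, `header_with_rg` and `extract_rg` are hypothetical stand-ins for the `samtools view -H` call and are not megSAP code.

```shell
#!/bin/sh
# Hypothetical stand-ins for `samtools view -H file.bam` (not real megSAP code).
header_without_rg() {
    printf '@HD\tVN:1.6\tSO:unsorted\n@PG\tID:basecaller\n'
}
header_with_rg() {
    printf '@HD\tVN:1.6\n@RG\tID:run1\tSM:sample1\n'
}

# grep exits with status 1 when no line matches, so the whole pipeline
# reports a failure although nothing actually went wrong.
header_without_rg | grep -E '^@RG' || echo "grep exit status: $?"

# Tolerant variant: read a SAM header on stdin and print its @RG lines.
# Status 1 (no match) is treated as an empty result; only status > 1
# (a real grep error) is passed on to the caller.
extract_rg() {
    status=0
    rg=$(grep -E '^@RG') || status=$?
    if [ "$status" -gt 1 ]; then
        return "$status"
    fi
    printf '%s' "$rg"
    return 0
}
```

With this shape, the caller can test for an empty result and emit its own "no read group" message instead of a generic command failure.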

leonschuetz commented 2 weeks ago

Can you check the following branch? https://github.com/imgag/megSAP/tree/empty_ReadGroup

caspargross commented 2 weeks ago

Thanks! Looks good now.

2024-10-18T09:15:59.3983    8Fu0    Parameter: out                  = analysis/megsap/HG001_BER_01.SUP.GRCh38/HG001-BER-01-SUP-GRCh38.bam
2024-10-18T09:15:59.4784    8Fu0    WARNING: 'WARNING: No ReadGroup description found!' in /mnt/storage2/users/ahgrosc1/dev/megsap/src/Common/genomics.php:2239.
2024-10-18T09:15:59.6431    8Fu0    Calling 'mapping' pipeline
2024-10-18T09:15:59.6431    8Fu0        command 1  = 
2024-10-18T09:15:59.6431    8Fu0        version    = n/a
2024-10-18T09:15:59.6431    8Fu0        parameters = (/mnt/storage2/megSAP/tools/samtools-1.20/samtools fastq -o /dev/null  -TMM,ML  analysis/megsap/HG001_BER_01.SUP.GRCh38/HG001-BER-01-SUP-GRCh38_01.mod.unmapped.bam)
2024-10-18T09:15:59.6431    8Fu0        command 2  = /mnt/storage2/megSAP/tools/minimap2-2.28_x64-linux/minimap2
2024-10-18T09:15:59.6431    8Fu0        version    = 2.28-r1209
2024-10-18T09:15:59.6431    8Fu0        parameters = -a --MD -x map-ont --eqx -t 32 -R '@RG\tID:HG001-BER-01-SUP-GRCh38\tSM:HG001-BER-01-SUP-GRCh38\tLB:HG001-BER-01-SUP-GRCh38\tCN:medical_genetics_tuebingen\tDT:2024-10-18T09:15:59+02:00\tPL:ONT' /tmp/local_ngs_data_GRCh38//GRCh38.fa -y - 
2024-10-18T09:15:59.6431    8Fu0        command 3  = /mnt/storage2/megSAP/tools/samtools-1.20/samtools
2024-10-18T09:15:59.6431    8Fu0        version    = 1.20
2024-10-18T09:15:59.6431    8Fu0        parameters = sort -T /tmp/megSAP_user_ahgrosc1/mapping_minimap_MDcRa9 -m 1G -@ 4 -o /tmp/megSAP_user_ahgrosc1/HG001-BER-01-SUP-GRCh38mVeJ49.bam -
2024-10-18T09:15:59.6431    8Fu0        pipeline   =  (/mnt/storage2/megSAP/tools/samtools-1.20/samtools fastq -o /dev/null  -TMM,ML  analysis/megsap/HG001_BER_01.SUP.GRCh38/HG001-BER-01-SUP-GRCh38_01.mod.unmapped.bam) 2>/tmp/megSAP_user_ahgrosc1/mapping_minimap_Yus1i8.stderr | /mnt/storage2/megSAP/tools/minimap2-2.28_x64-linux/minimap2 -a --MD -x map-ont --eqx -t 32 -R '@RG\tID:HG001-BER-01-SUP-GRCh38\tSM:HG001-BER-01-SUP-GRCh38\tLB:HG001-BER-01-SUP-GRCh38\tCN:medical_genetics_tuebingen\tDT:2024-10-18T09:15:59+02:00\tPL:ONT' /tmp/local_ngs_data_GRCh38//GRCh38.fa -y -  2>/tmp/megSAP_user_ahgrosc1/mapping_minimap_qmWLBa.stderr | /mnt/storage2/megSAP/tools/samtools-1.20/samtools sort -T /tmp/megSAP_user_ahgrosc1/mapping_minimap_MDcRa9 -m 1G -@ 4 -o /tmp/megSAP_user_ahgrosc1/HG001-BER-01-SUP-GRCh38mVeJ49.bam - 2>/tmp/megSAP_user_ahgrosc1/mapping_minimap_YpsTT8.stderr
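
The `-R` value passed to minimap2 in the log above has the shape of a SAM `@RG` header line with literal `\t` separators. As a rough sketch of how such a string could be assembled when the input BAM carries no read group, the helper below builds one from a sample name; `build_rg_arg` and its parameter order are assumptions made for illustration, and only the field names (ID, SM, LB, CN, DT, PL) are taken from the log.

```shell
#!/bin/sh
# Hypothetical helper (not megSAP code): build_rg_arg SAMPLE CENTER DATE PLATFORM
# Prints an @RG line with literal backslash-t separators, the form
# shown for the minimap2 -R argument in the log.
build_rg_arg() {
    sample=$1
    center=$2
    date=$3
    platform=$4
    # '\\t' in the printf format emits a backslash followed by 't',
    # not a tab character.
    printf '@RG\\tID:%s\\tSM:%s\\tLB:%s\\tCN:%s\\tDT:%s\\tPL:%s\n' \
        "$sample" "$sample" "$sample" "$center" "$date" "$platform"
}

# Example call with placeholder values.
build_rg_arg sample1 some_center 2024-10-18T09:15:59+02:00 ONT
```

The result would be passed as a single quoted argument, for example `minimap2 -R "$(build_rg_arg ...)"`.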