MikkelSchubert / paleomix

Pipelines and tools for the processing of ancient and modern HTS data.
https://paleomix.readthedocs.io/en/stable/
MIT License

"Positional data is too large for BAM format" #25

Closed lindenb closed 4 years ago

lindenb commented 4 years ago

Hi all,

I'm currently trying to align a set of FASTQ files against GRCh37.

$ samtools --version
samtools 1.9
Using htslib 1.9
Copyright (C) 2018 Genome Research Ltd.

Some steps are failing with the following message in the log file.

Reading SAM file from STDIN ...
Joinining subprocesses:
[E::bam_write1] Positional data is too large for BAM format
samtools fixmate: Couldn't write to output file: No such file or directory
[E::bgzf_flush] File write failed (wrong size)
[E::bgzf_close] File write failed
Traceback (most recent call last):
  File "/tmp/pip-unpacked-wheel-VmAfDD/paleomix/main.py", line 242, in <module>
    sys.exit(main(sys.argv[1:]))
  File "/tmp/pip-unpacked-wheel-VmAfDD/paleomix/main.py", line 235, in main
    return module.main(argv[1:])
  File "/sandbox/users/lindenbaum-p/.local/lib/python2.7/site-packages/paleomix/tools/cleanup.py", line 388, in main
    return _pipe_to_bam()
  File "/sandbox/users/lindenbaum-p/.local/lib/python2.7/site-packages/paleomix/tools/cleanup.py", line 93, in _pipe_to_bam
    output_handle.write(record)
  File "pysam/libcalignmentfile.pyx", line 1704, in pysam.libcalignmentfile.AlignmentFile.write
  File "pysam/libcalignmentfile.pyx", line 1736, in pysam.libcalignmentfile.AlignmentFile.write
IOError: sam_write1 failed with error code -1
  - Command finished: /sandbox/users/lindenbaum-p/packages/anaconda3/envs/PIP/bin/python /sandbox/users/lindenbaum-p/.local/lib/python2.7/site-packages/paleomix/main.pyc cleanup --fasta /sandbox/resources/species/human/cng.fr/hs37d5/hs37d5_all_chr.fasta --temp-prefix /sandbox/shares/u1087/lindenb/work/20200204.paleomix/tmp/c2e3b794-df63-4621-acd6-c932118126e4/bam_cleanup --min-quality 25 --exclude-flags 0x4 --samtools1x yes --rg-id ACTGGAC --rg SM:B00IK32 --rg LB:ACTGGAC --rg PU:Lane_1 --rg PL:ILLUMINA --rg PG:bwa pipe
    - Return-code:    1
  - Command finished: samtools sort -l 0 -O bam -T /sandbox/shares/u1087/lindenb/work/20200204.paleomix/tmp/c2e3b794-df63-4621-acd6-c932118126e4/bam_cleanup
    - Return-code:    0
  - Command finished: samtools calmd -b - /sandbox/resources/species/human/cng.fr/hs37d5/hs37d5_all_chr.fasta
    - Return-code:    0
  - Command finished: /sandbox/users/lindenbaum-p/packages/anaconda3/envs/PIP/bin/python /sandbox/users/lindenbaum-p/.local/lib/python2.7/site-packages/paleomix/main.pyc cleanup --fasta /sandbox/resources/species/human/cng.fr/hs37d5/hs37d5_all_chr.fasta --temp-prefix /sandbox/shares/u1087/lindenb/work/20200204.paleomix/tmp/c2e3b794-df63-4621-acd6-c932118126e4/bam_cleanup --min-quality 25 --exclude-flags 0x4 --samtools1x yes --rg-id ACTGGAC --rg SM:B00IK32 --rg LB:ACTGGAC --rg PU:Lane_1 --rg PL:ILLUMINA --rg PG:bwa cleanup
    - Return-code:    0
  - Command finished: samtools fixmate -O bam - -
    - Return-code:    1
Errors occured during processing!

Can you help me, please?
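
For context on the error message: the BAM format stores an alignment's position and its mate's position as signed 32-bit integers, so a coordinate above 2^31 - 1 cannot be written. The snippet below is a minimal sketch of that range check. The function name `fits_in_bam` is hypothetical, this is not htslib's actual code, and it is not a diagnosis of what went wrong in this particular run.

```python
# Largest coordinate that fits in a signed 32-bit BAM field.
INT32_MAX = 2**31 - 1


def fits_in_bam(pos, mate_pos):
    """Return True if both 0-based coordinates fit in BAM's int32 fields.

    -1 is the conventional value for "no position" and is allowed.
    (Hypothetical helper for illustration only.)
    """
    return all(-1 <= p <= INT32_MAX for p in (pos, mate_pos))
```

A record for which this returns False is the kind of record the message describes; human chromosome coordinates are far below this limit.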

MikkelSchubert commented 4 years ago

Hi Pierre,

What versions of PALEOMIX and BWA are you using, and which alignment algorithm and parameters are you using for BWA? And are the FASTQ files you are attempting to map publicly available?

Best regards, Mikkel

(In reply to Pierre Lindenbaum's message of Thu, Feb 6, 2020 at 11:02 AM.)

lindenb commented 4 years ago

@MikkelSchubert thank you for your response.

I've compiled a fresh version of samtools (see below). For now, the workflow runs without errors (67 of 539 tasks done). I'll keep you informed.

paleomix with bwa:

paleomix bam_pipeline run --destination $(OUTDIR) --jar-root=JAR_ROOT --temp-root=$(OUTDIR)/tmp --max-threads=10 template.yaml

lindenb commented 4 years ago

Oh, and the FASTQ files are not public.

lindenb commented 4 years ago

Ok, that worked after I updated samtools:

  Number of nodes:             539
  Number of done nodes:        539
  Number of runable nodes:     0
  Number of queued nodes:      0
  Number of outdated nodes:    0
  Number of failed nodes:      0
  Pipeline runtime:            21:56:44s

My apologies for the disturbance.
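
Since the fix here was a newer samtools, checking the version on the PATH before starting a long run may save some time. The following is a rough Python 3 sketch. The helper names and the `MIN_VERSION` threshold are assumptions for illustration only, because the thread does not say which samtools release resolved the error.

```python
import re
import subprocess

# Hypothetical threshold; the thread does not state which release fixed the error.
MIN_VERSION = (1, 10)


def parse_samtools_version(text):
    """Extract (major, minor) from the first line of `samtools --version` output."""
    match = re.match(r"samtools (\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("unrecognized samtools version output")
    return int(match.group(1)), int(match.group(2))


def installed_samtools_version():
    """Run `samtools --version` and return the parsed (major, minor) tuple."""
    output = subprocess.run(
        ["samtools", "--version"], capture_output=True, text=True, check=True
    ).stdout
    return parse_samtools_version(output)


def is_recent_enough(version, minimum=MIN_VERSION):
    """Compare version tuples, so that (1, 10) correctly sorts after (1, 9)."""
    return version >= minimum
```

Comparing integer tuples rather than strings matters here, since a plain string comparison would rank "1.9" above "1.10".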