Open Fer020707 opened 2 years ago
Hi,
These errors are produced by samtools when it tries to extract read count information. There might be an issue with your BAM file(s):
[W::bam_hdr_read] EOF marker is absent. The input is probably truncated
Maybe try running samtools quickcheck? (https://www.htslib.org/doc/samtools-quickcheck.html)
Can you also check that this particular .bai is present:
[mpileup] fail to load index for BAM/ISEM114.recal.reads.bam
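Both checks suggested above can be scripted. The following is a minimal sketch, not part of needlestack: the function names missing_indexes and quickcheck_bams are hypothetical, and it assumes the index sits next to its BAM file under either the sample.bam.bai or the sample.bai naming convention.

```shell
# List BAM files in a folder that have no index next to them.
# Accepts either naming convention: sample.bam.bai or sample.bai.
missing_indexes() {
  local dir="$1" bam
  for bam in "$dir"/*.bam; do
    [ -e "$bam" ] || continue   # the folder contains no BAM files
    if [ ! -e "$bam.bai" ] && [ ! -e "${bam%.bam}.bai" ]; then
      echo "$bam"
    fi
  done
}

# Run samtools quickcheck on every BAM in a folder.
# With -v, samtools prints the names of the files that fail the check.
quickcheck_bams() {
  local dir="$1"
  if ! command -v samtools >/dev/null 2>&1; then
    echo "samtools not found in PATH" >&2
    return 127
  fi
  samtools quickcheck -v "$dir"/*.bam
}
```

For example, missing_indexes BAM would print BAM/ISEM114.recal.reads.bam if that file had no index, and quickcheck_bams BAM would list any file whose EOF marker is absent.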
Thank you very much for your reply. I downloaded the BAM files again and reran the program. The previous problem appears to be solved, but I now get another error when the VCF results are collected.
Error message:
executor > local (4)
[f5/bc2a25] process > bed [100%] 1 of 1 ✔
[45/b0edda] process > split_bed [100%] 1 of 1 ✔
[7b/5fbeff] process > mpileup2vcf (chr1_17345370-chrX_14883636... [100%] 1 of 1 ✔
[84/dfbf6c] process > collect_vcf_result [ 0%] 0 of 1
Error executing process > 'collect_vcf_result'
Caused by:
Process `collect_vcf_result` terminated with an error exit status (1)
Command executed:
mkdir VCF
mv *.vcf VCF
# Extract the header from the first VCF
sed '/^#CHROM/q' VCF/chr1_17345370-chrX_14883636_regions.vcf > header.txt
# Add contigs in the VCF header
cat ucsc.hg19.fasta.fai | cut -f1,2 | sed -e 's/^/##contig=<ID=/' -e 's/[ ][ ]*/,length=/' -e 's/$/>/' > contigs.txt
sed -i '/##reference=.*/ r contigs.txt' header.txt
# Add version numbers in the VCF header
echo '##command=nextflow run iarcbioinfo/needlestack -with-docker --bed bed/SOPHIA_targetregions_hg19_ORDERED.bed --input_bams bam/SOPHIA/ --ref ref/ucsc.hg19.fasta --output_vcf output/SOPHIA_needlestack.vcf' > versions.txt
...
# Check if sort command allows sorting in natural order (chr1 chr2 chr10 instead of chr1 chr10 chr2)
if [ `sort --help | grep -c 'version-sort' ` == 0 ]
then
sort_ops="-k1,1d"
else
sort_ops="-k1,1V"
fi
# Add all VCF contents and sort
grep --no-filename -v '^#' VCF/*.vcf | LC_ALL=C sort -t ' ' $sort_ops -k2,2n >> header.txt
mv header.txt output/SOPHIA_needlestack.vcf
...
I think the actual error message has been cut off. Could you copy the content of the .command.err file in the work/84/dfbf6c... folder?
Also, can you check the content of the VCF/chr1_17345370-chrX_14883636_regions.vcf file in the same folder? This last process simply merges all the VCFs produced when your BED file is split into multiple regions, but in your case there is only one.
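A quick way to inspect that VCF is to count its header lines and records and to confirm that the #CHROM line, which the header extraction step relies on, is present. This is only a sketch; vcf_summary is a hypothetical helper name, not a needlestack command.

```shell
# Print how many header lines and variant records a VCF contains,
# and whether it has the #CHROM column line.
vcf_summary() {
  local vcf="$1" headers records chrom
  # grep -c exits with status 1 when the count is 0, hence the "|| true"
  headers=$(grep -c '^#' "$vcf" || true)
  records=$(grep -vc '^#' "$vcf" || true)
  if grep -q '^#CHROM' "$vcf"; then chrom=yes; else chrom=no; fi
  echo "header_lines=$headers records=$records chrom_line=$chrom"
}
```

Run from inside the work folder, the call would be vcf_summary VCF/chr1_17345370-chrX_14883636_regions.vcf.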
When reviewing the .command.sh file, I found that the last command (mv header.txt output/all_variants.vcf) could not be executed: it was running in work/84/dfbf6c.., so it could not find the output path, which I had given as a relative path when launching the program. After correcting that, the program worked perfectly. Thank you very much.
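The behaviour described here can be reproduced outside the pipeline. The sketch below uses a scratch directory as a stand-in for the launch and work folders (the directory names are made up for illustration): a relative destination is resolved against the current directory, so the same mv fails from the work folder and succeeds once the destination is absolute.

```shell
# Reproduce the failure in a scratch directory.
demo_relative_mv() {
  local root work
  root=$(mktemp -d)
  work="$root/work/task"             # stand-in for work/84/dfbf6c..
  mkdir -p "$root/output" "$work"    # output/ exists only in the launch folder
  touch "$work/header.txt"

  # Relative destination: there is no output/ inside the work folder.
  if ( cd "$work" && mv header.txt output/all_variants.vcf 2>/dev/null ); then
    echo "relative path: ok"
  else
    echo "relative path: failed"
  fi

  # Absolute destination: resolves to the same place from any directory.
  if ( cd "$work" && mv header.txt "$root/output/all_variants.vcf" ); then
    echo "absolute path: ok"
  else
    echo "absolute path: failed"
  fi
  rm -rf "$root"
}
```

Calling demo_relative_mv reports the relative move as failed and the absolute one as ok.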
Indeed, the --output_vcf parameter is supposed to be only the name of the VCF file (not including its path), but I agree that's confusing. You can use it together with the --output_folder parameter. We made this choice because there are other outputs, but we could easily allow the --output_vcf parameter to include a relative path as well.
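As a sketch of that split, a desired VCF location can be divided into the two parameters with dirname and basename. The helper name output_args is hypothetical, and the sketch assumes --output_folder accepts the folder part exactly as written.

```shell
# Split a desired VCF location into the two needlestack parameters:
# the folder goes to --output_folder, the bare file name to --output_vcf.
output_args() {
  local path="$1"
  echo "--output_folder $(dirname "$path") --output_vcf $(basename "$path")"
}
```

Applied to the command from this thread, the call would become: nextflow run iarcbioinfo/needlestack -with-docker --bed bed/SOPHIA_targetregions_hg19_ORDERED.bed --input_bams bam/SOPHIA/ --ref ref/ucsc.hg19.fasta $(output_args output/SOPHIA_needlestack.vcf)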
Thanks for spotting this, I hope you will find needlestack useful!
Thank you very much for everything, excellent program
Hi, I am trying to run the program using:
The BED file is sorted, the BAM files are in the corresponding folder with their respective .bai files, and the reference folder contains the .fasta, .fai, .dict, .amb, .ann, .bwt, .pac and .sa files.
However, I get an error during execution. I would appreciate your help, thank you.
Error message: