Gabaldonlab / perSVade

perSVade: personalized Structural Variation detection
GNU General Public License v3.0

Bug in perSVade optimize_parameters module #19

Closed snitkin-lab closed 8 months ago

snitkin-lab commented 9 months ago

Hi,

There might be a typo in the perSVade optimize_parameters module. Instead of looking for aligned_reads.bam.sorted, it looks for the file aligned_reads/aligned_reads.sorted.bam and throws this error:

Traceback (most recent call last):
  File "/perSVade/scripts/get_cov_genes", line 128, in <module>
    sorted_bam, index_bam = fun.get_sorted_bam_in_outdir(opt.sortedbam, opt.outdir)
  File "/perSVade/scripts/sv_functions.py", line 21250, in get_sorted_bam_in_outdir
    soft_link_files(get_fullpath(perSVade_sorted_bam), sorted_bam)
  File "/perSVade/scripts/sv_functions.py", line 2900, in soft_link_files
    if file_is_empty(origin): raise ValueError("The origin %s should exist"%origin)
ValueError: The origin SAMN09111955/aligned_reads/aligned_reads.sorted.bam should exist

I renamed the aligned_reads.bam.sorted file to aligned_reads.sorted.bam, and it then ran without any error.
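
The rename described above can also be done without touching the original output, by adding symlinks under the expected names. This is a minimal sketch, assuming the align_reads output directory holds aligned_reads.bam.sorted and its .bai index as reported in this thread. The helper name alias_sorted_bam is hypothetical and not part of perSVade, and whether every downstream module accepts a symlink is an assumption; the rename is the workaround confirmed here.

```python
import os


def alias_sorted_bam(outdir):
    """Create aligned_reads.sorted.bam(.bai) symlinks that point at the
    aligned_reads.bam.sorted(.bai) files reported in this thread.

    Hypothetical helper, not part of perSVade. Returns the BAM alias path.
    """
    pairs = [
        ("aligned_reads.bam.sorted", "aligned_reads.sorted.bam"),
        ("aligned_reads.bam.sorted.bai", "aligned_reads.sorted.bam.bai"),
    ]
    for original, alias in pairs:
        src = os.path.join(outdir, original)
        dst = os.path.join(outdir, alias)
        # Refuse to alias a missing or empty file.
        if not os.path.isfile(src) or os.path.getsize(src) == 0:
            raise FileNotFoundError("expected a non-empty file at %s" % src)
        # Skip aliases that already exist so the helper can be re-run.
        if not os.path.lexists(dst):
            # Relative link target, so the directory can be moved as a whole.
            os.symlink(original, dst)
    return os.path.join(outdir, pairs[0][1])
```

Calling alias_sorted_bam("SAMN09111955/aligned_reads") would then give a path that can be passed through -sbam.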

MikiSchikora commented 9 months ago

Good morning,

Thanks for noting this. How exactly did you obtain this error? Can you attach the command used and the full log? It seems to me that you are getting the error while running get_cov_genes, not optimize_parameters, right?

If I am interpreting this correctly, you provided -sbam SAMN09111955/aligned_reads/aligned_reads.sorted.bam to the get_cov_genes module, and this file does not exist, right? The file provided through -sbam is the input, so it has to exist; its absence is the source of the problem. Does this make sense?
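
One way to confirm this before launching a module is to check that the path passed through -sbam points to an existing, non-empty file. This is a minimal sketch: check_sbam_input is a hypothetical name, and the check only approximates the condition behind the "should exist" error in the traceback.

```python
import os


def check_sbam_input(sbam_path):
    """Return an error message if sbam_path is unusable as input, else None.

    Hypothetical helper, not part of perSVade.
    """
    full_path = os.path.abspath(sbam_path)
    if not os.path.isfile(full_path):
        return "The file %s does not exist" % full_path
    if os.path.getsize(full_path) == 0:
        return "The file %s is empty" % full_path
    return None
```

For example, check_sbam_input("SAMN09111955/aligned_reads/aligned_reads.sorted.bam") returns a message when only aligned_reads.bam.sorted is present in that directory.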

Best,

Miquel Àngel Schikora Tamarit, PhD BSC-IRB

weiydcn commented 9 months ago

I also hit this bug when running "perSVade align_reads":

nohup 1>S0.log 2>&1 docker run --rm -v /mnt/02_data/weiyidong/work/2024.1.24_Svcall/r498/persvade:/genome \
-v /mnt/02_data/weiyidong/work/2024.1.24_Svcall/r498/persvade/output_S0:/output_directory \
-v /mnt/02_data/weiyidong/work/2024.1.24_Svcall/r498/01-clean-data_L:/reads persvade python \
-u ./scripts/perSVade align_reads \
-r /genome/genome.fasta \
-o /output_directory \
-f1 /reads/S0_1.fq.gz -f2 /reads/S0_2.fq.gz&

The output file is aligned_reads.bam.sorted instead of aligned_reads.sorted.bam, so running the downstream commands from the "EXAMPLE" section of the wiki as written fails with an error.

alipirani88 commented 9 months ago

The logs were overwritten after I fixed the error, but here are the commands that I used:

singularity exec -e /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/singularity/mikischikora_persvade_v1.02.6.sif bash -c 'source /opt/conda/etc/profile.d/conda.sh && conda activate perSVade_env && python /perSVade/scripts/perSVade trim_reads_and_QC -f1 /nfs/turbo/umms-esnitkin/Project_Cauris/Sequence_data/fastq/Public_samples/cluster_B11245/SAMN09111955_1.fastq.gz -f2 /nfs/turbo/umms-esnitkin/Project_Cauris/Sequence_data/fastq/Public_samples/cluster_B11245/SAMN09111955_2.fastq.gz -o /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/cluster_B11245//SAMN09111955/trimmed_reads --fraction_available_mem 0.25 --threads 2'

singularity exec -e /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/singularity/mikischikora_persvade_v1.02.6.sif bash -c 'source /opt/conda/etc/profile.d/conda.sh && conda activate perSVade_env && python /perSVade/scripts/perSVade align_reads -f1 SAMN09111955/trimmed_reads/trimmed_reads1.fastq.gz -f2 SAMN09111955/trimmed_reads/trimmed_reads2.fastq.gz --ref /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/reference/B11245/B11245_funannotate/annotate_results/B11245.fasta -o /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/cluster_B11245//SAMN09111955/aligned_reads --fraction_available_mem 0.25 --threads 2'

cp /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/cluster_B11245//SAMN09111955/aligned_reads/aligned_reads.bam.sorted /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/cluster_B11245//SAMN09111955/aligned_reads/aligned_reads.sorted.bam && cp /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/cluster_B11245//SAMN09111955/aligned_reads/aligned_reads.bam.sorted.bai /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/cluster_B11245//SAMN09111955/aligned_reads/aligned_reads.sorted.bam.bai

singularity exec -e /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/singularity/mikischikora_persvade_v1.02.6.sif bash -c 'source /opt/conda/etc/profile.d/conda.sh && conda activate perSVade_env && python /perSVade/scripts/perSVade infer_repeats --ref /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/reference/B11245/B11245_funannotate/annotate_results/B11245.fasta -o /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/cluster_B11245//SAMN09111955/repeat_inference --fraction_available_mem 0.25 --threads 2'

singularity exec -e /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/singularity/mikischikora_persvade_v1.02.6.sif bash -c 'source /opt/conda/etc/profile.d/conda.sh && conda activate perSVade_env && python /perSVade/scripts/perSVade find_homologous_regions --ref /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/reference/B11245/B11245_funannotate/annotate_results/B11245.fasta -o /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/cluster_B11245//SAMN09111955/find_hom_regions --fraction_available_mem 0.25 --threads 2'

singularity exec -e /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/singularity/mikischikora_persvade_v1.02.6.sif bash -c 'source /opt/conda/etc/profile.d/conda.sh && conda activate perSVade_env && python /perSVade/scripts/perSVade optimize_parameters --ref /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/reference/B11245/B11245_funannotate/annotate_results/B11245.fasta -o /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/cluster_B11245//SAMN09111955/parameter_optimization -sbam /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/cluster_B11245//SAMN09111955/aligned_reads/aligned_reads.sorted.bam --repeats_file /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/cluster_B11245//SAMN09111955/repeat_inference/combined_repeats.tab --regions_SVsimulations random --simulation_ploidies haploid --fraction_available_mem 0.25 --mitochondrial_chromosome no_mitochondria --threads 2'

singularity exec -e /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/singularity/mikischikora_persvade_v1.02.6.sif bash -c 'source /opt/conda/etc/profile.d/conda.sh && conda activate perSVade_env && python /perSVade/scripts/perSVade call_SVs --ref /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/reference/B11245/B11245_funannotate/annotate_results/B11245.fasta -o /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/cluster_B11245//SAMN09111955/call_SVs -sbam /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/cluster_B11245//SAMN09111955/aligned_reads/aligned_reads.sorted.bam --SVcalling_parameters /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/cluster_B11245//SAMN09111955/parameter_optimization/optimized_parameters.json --repeats_file /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/cluster_B11245//SAMN09111955/repeat_inference/combined_repeats.tab --fraction_available_mem 0.25 --mitochondrial_chromosome no_mitochondria --threads 2'

singularity exec -e /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/singularity/mikischikora_persvade_v1.02.6.sif bash -c 'source /opt/conda/etc/profile.d/conda.sh && conda activate perSVade_env && python /perSVade/scripts/perSVade call_CNVs --ref /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/reference/B11245/B11245_funannotate/annotate_results/B11245.fasta -o /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/cluster_B11245//SAMN09111955/call_CNVs -sbam /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/cluster_B11245//SAMN09111955/aligned_reads/aligned_reads.sorted.bam -p 1 --cnv_calling_algs HMMcopy,AneuFinder --window_size_CNVcalling 500 --fraction_available_mem 0.25 --mitochondrial_chromosome no_mitochondria --threads 2'

singularity exec -e /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/singularity/mikischikora_persvade_v1.02.6.sif bash -c 'source /opt/conda/etc/profile.d/conda.sh && conda activate perSVade_env && python /perSVade/scripts/perSVade integrate_SV_CNV_calls -o output/integrated_SV_CNV_calls --ref /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/reference/B11245/B11245_funannotate/annotate_results/B11245.fasta -p 1 -sbam /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/cluster_B11245//SAMN09111955/aligned_reads/aligned_reads.sorted.bam --outdir_callSVs  /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/cluster_B11245//SAMN09111955/call_SVs --outdir_callCNVs /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/cluster_B11245//SAMN09111955/call_CNVs --repeats_file skip --fraction_available_mem 0.25 --mitochondrial_chromosome no_mitochondria --threads 2'

singularity exec -e /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/singularity/mikischikora_persvade_v1.02.6.sif bash -c 'source /opt/conda/etc/profile.d/conda.sh && conda activate perSVade_env && python /perSVade/scripts/perSVade annotate_SVs -o /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/cluster_B11245//SAMN09111955/annotate_SVs --ref /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/reference/B11245/B11245_funannotate/annotate_results/B11245.fasta -gff /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/reference/B11245/B11245_funannotate/annotate_results/B11245.gff -mcode 3 -gcode 1 --SV_CNV_vcf /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/cluster_B11245//SAMN09111955/integrated_SV_CNV_calls/SV_and_CNV_variant_calling.vcf --fraction_available_mem 0.25 --mitochondrial_chromosome no_mitochondria --threads 2'

singularity exec -e /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/singularity/mikischikora_persvade_v1.02.6.sif bash -c 'source /opt/conda/etc/profile.d/conda.sh && conda activate perSVade_env && python /perSVade/scripts/perSVade call_small_variants -o /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/cluster_B11245//SAMN09111955/small_vars --ref /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/reference/B11245/B11245_funannotate/annotate_results/B11245.fasta -p 1 -sbam /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/cluster_B11245//SAMN09111955/aligned_reads/aligned_reads.sorted.bam --repeats_file /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/cluster_B11245//SAMN09111955/repeat_inference/combined_repeats.tab --callers bcftools,freebayes,HaplotypeCaller --min_AF 0.9 --min_coverage 20 --fraction_available_mem 0.25 --threads 2'

singularity exec -e /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/singularity/mikischikora_persvade_v1.02.6.sif bash -c 'source /opt/conda/etc/profile.d/conda.sh && conda activate perSVade_env && python /perSVade/scripts/perSVade annotate_small_vars -o /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/cluster_B11245//SAMN09111955/annotate_small_vars --ref /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/reference/B11245/B11245_funannotate/annotate_results/B11245.fasta -gff /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/reference/B11245/B11245_funannotate/annotate_results/B11245.gff -mcode 3 -gcode 1 --merged_vcf /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/cluster_B11245//SAMN09111955/small_vars/merged_vcfs_allVars_ploidy1.vcf --fraction_available_mem 0.25 --mitochondrial_chromosome no_mitochondria --threads 2'

singularity exec -e /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/singularity/mikischikora_persvade_v1.02.6.sif bash -c 'source /opt/conda/etc/profile.d/conda.sh && conda activate perSVade_env && python /perSVade/scripts/perSVade get_cov_genes -o /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/cluster_B11245//SAMN09111955/get_cov_genes --ref /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/reference/B11245/B11245_funannotate/annotate_results/B11245.fasta -gff /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/reference/B11245/B11245_funannotate/annotate_results/B11245.gff -sbam /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/cluster_B11245//SAMN09111955/aligned_reads/aligned_reads.sorted.bam --fraction_available_mem 0.25 --threads 2'

I fixed it by copying the files to the expected names with:

cp /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/cluster_B11245//SAMN09111955/aligned_reads/aligned_reads.bam.sorted /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/cluster_B11245//SAMN09111955/aligned_reads/aligned_reads.sorted.bam && cp /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/cluster_B11245//SAMN09111955/aligned_reads/aligned_reads.bam.sorted.bai /scratch/esnitkin_root/esnitkin/apirani/Project_Cauris/Analysis/2024_02_06_perSVade_analysis/cluster_B11245//SAMN09111955/aligned_reads/aligned_reads.sorted.bam.bai

Sorry about the absolute paths.
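
The repeated absolute paths can be factored out by building each command from a few variables. This is a minimal sketch, assuming the same singularity image layout and conda environment as in the commands above. The function build_persvade_cmd and the /path/to/... values are hypothetical placeholders, and the sketch only assembles the argument list without running anything.

```python
import shlex


def build_persvade_cmd(sif_image, module, module_args):
    """Assemble a `singularity exec` call that runs one perSVade module.

    Hypothetical helper: it mirrors the command layout used in this thread
    and returns the argument list without executing it.
    """
    inner = " && ".join([
        "source /opt/conda/etc/profile.d/conda.sh",
        "conda activate perSVade_env",
        "python /perSVade/scripts/perSVade " + module + " "
        + " ".join(shlex.quote(str(a)) for a in module_args),
    ])
    return ["singularity", "exec", "-e", sif_image, "bash", "-c", inner]


# Placeholder paths, to be replaced with real ones.
sample_dir = "/path/to/analysis/SAMN09111955"
cmd = build_persvade_cmd(
    "/path/to/mikischikora_persvade_v1.02.6.sif",
    "get_cov_genes",
    ["-o", sample_dir + "/get_cov_genes",
     "--ref", "/path/to/B11245.fasta",
     "-gff", "/path/to/B11245.gff",
     "-sbam", sample_dir + "/aligned_reads/aligned_reads.sorted.bam",
     "--fraction_available_mem", "0.25", "--threads", "2"],
)
# Print the command in copy-pasteable shell form.
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True), so only sample_dir changes from one sample to the next.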

MikiSchikora commented 9 months ago

Hi,

Alright, so I see that there is a typo in the EXAMPLE section of the wiki, right? This is fixed now. But perSVade is consistently generating an aligned_reads.bam.sorted (not aligned_reads.sorted.bam), right? With this, your error is solved, right? I apologize for the confusing EXAMPLE nomenclature.

Best, Miquel Àngel

alipirani88 commented 9 months ago

Thanks for pointing that out.

MikiSchikora commented 9 months ago

Hi,

No worries. So this is solved now, right?

Best, Miquel Àngel

alipirani88 commented 9 months ago

Hi Miquel,

Yes, this is resolved now. Thank you for helping me with this and thanks for putting together this pipeline.