milesandersonmn closed this issue 2 years ago
Hello, Could you please describe your input data type (WES or WGS) and command? Thanks, Li
It is WGS data.
/data/proj/chilense/30_genomes_outputs/Miles/LinkedSV/linkedsv.py -i /data/proj/chilense/30_genomes_outputs/Miles/10xLinks/SI-GA-D6/outs/phased_possorted_bam.bam -r /data/proj/chilense/30_genomes_outputs/reference/S_chilense_new/S_chilense_reference_rename.fasta -d /data/proj/chilense/30_genomes_outputs/Miles -t 40 --somatic_mode --gap_region empty.bed --black_region_bed empty.bed
Could you please send the following 3 files to fangli2718@gmail.com so that I can test them?
/data/proj/chilense/30_genomes_outputs/Miles/phased_possorted_bam.bam.node33
/data/proj/chilense/30_genomes_outputs/Miles/phased_possorted_bam.bam.node33.candidates
/data/proj/chilense/30_genomes_outputs/reference/S_chilense_new/S_chilense_reference_rename.fasta.fai
Thanks.
Ok I thought it might be a disk quota issue, but I ran the process again with more disk space and it still didn't work.
The process doesn't create a .node33.candidates file.
Running the process on the cluster's local scratch didn't produce any errors, but it also didn't produce any of the final output files. The command produced only the following files:
phased_possorted_bam.bam.arguments
phased_possorted_bam.bam.barcode_cov.bed
phased_possorted_bam.bam.barcode_statistics
phased_possorted_bam.bam.bcd21.gz
phased_possorted_bam.bam.bcd22
phased_possorted_bam.bam.bcd22.tmp
phased_possorted_bam.bam.fragment_statistics
phased_possorted_bam.bam.high_cov.bed
phased_possorted_bam.bam.low_mapq.bcd21.gz
phased_possorted_bam.bam.node33
phased_possorted_bam.bam.node35
phased_possorted_bam.bam.node53
phased_possorted_bam.bam.node55
phased_possorted_bam.bam.weird_reads.txt
@fangli80 @milesandersonmn Did you solve this problem? I got exactly the same error:
ERROR: Failed to run command: /tools/LinkedSV/scripts/../bin/remove_sparse_nodes /projects/hic/2022_F_CTC/sv_detect/linkedsv/output/SHR/SHR_phased_possorted_bam.bam.node33 /projects/hic/2022_F_CTC/sv_detect/linkedsv/output/SHR/SHR_phased_possorted_bam.bam.node33.candidates 5789 /refs/rn7_ucsc/rn7chr.fa.fai 10
@theshowmustgolangon Sorry for the inconvenience. I cannot replicate this problem with my data. Would you mind sharing your dataset with me so that I can test on it?
Best, Li
@fangli80 I cannot find a .node33.candidates file. The process does not seem to create one. I shared my Google Drive with you.
@theshowmustgolangon It seems that there is only a .fai file in the shared folder. Could you share the SHR_phased_possorted_bam.bam file?
Thanks, Li
@fangli80 Hi, I added the wrong file. I will get back to you after I upload the correct one. Thank you for your support!
@fangli80 Hi, I uploaded it to my school OneDrive and shared it with you. Could you check your Gmail, please? I am also uploading a BAM file from another sample. Both samples caused the same error. Could you check them, please?
Thanks, Pete
Hi, Pete
It's been a long while since I worked with linked-read data, but if I recall correctly, the problem was that I was using the draft reference instead of the reference files created by the Longranger pipeline. I can't tell from your command, but the reference genome should be output by Longranger into a directory path such as "~/refdata-myGenome/fasta/genome.fa"
This genome.fa file should be used as your reference argument. Again, I'm not 100% positive, as it's been quite some time, but that is the solution that comes to mind when I try to remember.
Good luck, Miles
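One way to test the reference-mismatch idea above is to compare the contigs recorded in the BAM header with those in the .fai index passed to LinkedSV. The following is only a minimal sketch, not part of LinkedSV: the function names are hypothetical, and it assumes the header was dumped to text first (for example with `samtools view -H phased_possorted_bam.bam > header.sam`).

```python
def parse_sq_lines(header_text):
    """Return {contig_name: length} from the @SQ lines of a SAM header."""
    contigs = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Each field after the record type looks like TAG:value.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        contigs[fields["SN"]] = int(fields["LN"])
    return contigs


def parse_fai(fai_text):
    """Return {contig_name: length} from the first two columns of a .fai file."""
    contigs = {}
    for line in fai_text.splitlines():
        if line.strip():
            name, length = line.split("\t")[:2]
            contigs[name] = int(length)
    return contigs


def compare_contigs(bam_contigs, fai_contigs):
    """Report contigs that differ between the BAM header and the .fai index."""
    shared = set(bam_contigs) & set(fai_contigs)
    return {
        "missing_from_fai": sorted(set(bam_contigs) - set(fai_contigs)),
        "only_in_fai": sorted(set(fai_contigs) - set(bam_contigs)),
        "length_mismatch": sorted(
            n for n in shared if bam_contigs[n] != fai_contigs[n]
        ),
    }
```

If all three lists come back empty, the BAM and the reference index at least agree on contig names and lengths, which makes a draft-versus-Longranger reference mix-up less likely. It does not prove the sequences themselves are identical.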
@milesandersonmn Hi Miles, thank you very much for your comment. I checked my command based on your advice, and I think I used the reference genome correctly, as you described. I will let you know what caused my errors if @fangli80 finds the cause!
Thank you!
I am downloading the bam file. I will let you know the updates after I test on it.
@milesandersonmn @fangli80 I ran LinkedSV successfully. The problem was that samtools could not access a gcc-related shared library, libstdc++.so.6, in the /usr/lib64 directory. I had been using the bedtools and samtools installed by conda; the errors went away after I switched to the samtools and bedtools loaded from the HPC server I am using. Thank you for your support, and I hope this helps anyone who gets this error.
@fangli80 I uploaded another phased_possorted_bam.bam file and shared it with you. I would appreciate any advice on generating a blacklist file.
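For anyone who wants to check for the same samtools/bedtools problem before a long run, a quick probe like the one below can show which copy of each tool is first on PATH and whether it starts at all. This is a rough sketch rather than anything LinkedSV provides: `probe_tool` is a hypothetical helper, and it assumes a broken shared-library link shows up as a non-zero exit code with the loader message on stderr.

```python
import shutil
import subprocess


def probe_tool(name, args=("--version",)):
    """Locate a tool on PATH and try to run it, capturing any loader errors."""
    path = shutil.which(name)
    if path is None:
        return {"tool": name, "path": None, "returncode": None,
                "stderr": "not found on PATH"}
    proc = subprocess.run([path, *args], capture_output=True, text=True)
    return {"tool": name, "path": path, "returncode": proc.returncode,
            "stderr": proc.stderr.strip()}


if __name__ == "__main__":
    # samtools and bedtools both accept --version.
    for tool in ("samtools", "bedtools"):
        print(probe_tool(tool))
```

A path inside a conda environment together with a non-zero return code and a message mentioning libstdc++.so.6 would point to the situation described above. Switching to the cluster's own samtools and bedtools should then change the reported path.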
I'm running an analysis on tomato using 10x reads, and my linkedSV pipeline is failing to run the remove_sparse_nodes command.
Here is a sample from the end of the output log:
[12/26/2021 12:56:31 (227.287 MB)] N95_fragment_length is: 5857
[12/26/2021 12:57:34 (187.400 MB)] finished getting fragment parameters
[12/26/2021 12:57:34 (186.352 MB)] searching for paired breakpoints
[12/26/2021 12:57:34 (186.352 MB)] searching paired breakpoints
[12/26/2021 12:57:34 (186.352 MB)] building nodes from fragments
[12/26/2021 12:57:34 (186.352 MB)] reading bcd22 file:/data/proj/chilense/30_genomes_outputs/Miles/phased_possorted_bam.bam.bcd22
[12/26/2021 12:58:00 (755.536 MB)] total number of fragments: 1294628
[12/26/2021 12:58:01 (755.536 MB)] writing to node file
[12/26/2021 12:58:38 (187.208 MB)] removing sparse nodes, min_support_fragments is 10
[12/26/2021 12:58:38 (187.208 MB)] Running CMD: /data/proj/chilense/30_genomes_outputs/Miles/LinkedSV/scripts/../bin/remove_sparse_nodes /data/proj/chilense/30_genomes_outputs/Miles/phased_possorted_bam.bam.node33 /data/proj/chilense/30_genomes_outputs/Miles/phased_possorted_bam.bam.node33.candidates 5000 /data/proj/chilense/30_genomes_outputs/reference/S_chilense_new/S_chilense_reference_rename.fasta.fai 10
[12/26/2021 13:46:16 (3.293 MB)] ERROR: Failed to run command: /data/proj/chilense/30_genomes_outputs/Miles/LinkedSV/scripts/../bin/remove_sparse_nodes /data/proj/chilense/30_genomes_outputs/Miles/phased_possorted_bam.bam.node33 /data/proj/chilense/30_genomes_outputs/Miles/phased_possorted_bam.bam.node33.candidates 5000 /data/proj/chilense/30_genomes_outputs/reference/S_chilense_new/S_chilense_reference_rename.fasta.fai 10
[12/26/2021 13:46:16 (4.215 MB)] Return value is: 9