Open Wicker-Lab opened 7 months ago
Hi
It's a bit difficult for me to track the bug without more information. It seems some step in annotating the VCF with repeats fails.
Could you share svim-asm_variants.vcf from out/1_SV_search, the TE library, and the reference genome?
Hi~
I ran into the same problem. My task finished in the 'complete' state, but the output directory only contained [1_SV_search 2_Repeat_Filtering], and there was no error message in my task log:
executor > local (43)
[46/b0b3f5] process > map_longreads (18) [100%] 20 of 20 ✔
[41/029f00] process > sniffles_sample_call (20) [100%] 20 of 20 ✔
[68/fa7e45] process > sniffles_population_call (1) [100%] 1 of 1 ✔
[5a/63a7ce] process > repeatmask_VCF (1) [100%] 1 of 1 ✔
[f7/095c3b] process > tsd_prep (1) [100%] 1 of 1 ✔
[- ] process > tsd_search -
[- ] process > tsd_report -
[- ] process > make_graph -
[- ] process > graph_align_reads -
[- ] process > vg_call -
[- ] process > merge_VCFs -
Completed at: 08-May-2024 19:04:10
Duration : 2d 8h 9m 43s
CPU hours : 2'695.2
Succeeded : 43
I use GRCh38 as my reference genome. Here are my sniffles2_variants.vcf from [out/1_SV_search] and my TE library: sniffles2_variants.xlsx ERV_element_seq_library_0318.txt
Any idea how to solve this problem? Thank you for your help!
Hello @Wicker-Lab and @xxYaaoo,
I'm sorry I missed both your messages.
This error previously happened to me because the default /tmp dir was not accessible. During the annotation process, several calls are made to create a tmp dir, and on some machines the default /tmp dir is too small. A typical symptom is that the folder "2_Repeat_Filtering" is created, but there are no variants in the VCF "genotypes_repmasked_filtered".
To check whether this is the case, look at the Nextflow logs for this specific process. They should be located in work/9c/720983*/.command.out or work/9c/720983*/.command.err for you, @Wicker-Lab, and in work/5a/63a7ce*/.command.out or work/5a/63a7ce*/.command.err for you, @xxYaaoo. Look for something like mktemp: failed to create directory [...]: No space left on device or unable to create /x/y/z: No space left on device (or similar). If that's the case, look here. If you can't find anything related to a /tmp directory issue, send us the complete logs here and we will figure out what's wrong!
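The log check described above could be scripted roughly as follows. This is only a sketch: the function name scan_tmp_errors is made up for illustration, and the search string is simply the "No space left on device" message quoted above, so other failure messages would need their own pattern.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of GraffiTE): scan one Nextflow task
# directory for the temp-directory failure message quoted in this thread.
# Usage: scan_tmp_errors <task_dir>    e.g. scan_tmp_errors work/5a/63a7ce*
scan_tmp_errors() {
    local task_dir="$1"
    local found=1   # exit status: 0 if the message was found, 1 otherwise
    local log
    for log in "$task_dir/.command.out" "$task_dir/.command.err"; do
        # Skip a log file that does not exist for this task.
        [ -f "$log" ] || continue
        # -n prints line numbers, -H prints the file name of each match.
        if grep -nH "No space left on device" "$log"; then
            found=0
        fi
    done
    return "$found"
}
```

The function prints any matching lines and returns 0 when at least one is found. Running df -h /tmp alongside it shows how much space is left on the default temp location.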
Another possibility might be the error discussed here: https://github.com/cgroza/GraffiTE/issues/8
Sorry again for the delay. I'm looking forward to getting you out of trouble!
Cheers,
Clément
Hello,
First of all: thank you very much for the pipeline.
I ran the command like this:
nextflow run cgroza/GraffiTE --assemblies assemblies.csv --TE_library nrTREP20 --reference Bgt_genome_v3_16 --graph_method pangenie --genotype false
The goal is to detect SVs in 10 fully assembled genomes (so I haven't added any reads for mapping so far).
As far as I understand, the 3rd output folder (TSD search), and especially the pangenome.vcf, should still be written, shouldn't they?
I only get:
[rest_of_path]/out$ ls
1_SV_search 2_Repeat_Filtering
The output files in these folders look "normal" as far as I can tell. For example, the file "indels.fa.masked" has many sequences, most of which are at least partially repeat masked.
However, the file genotypes_repmasked_filtered.vcf somehow has no variants, even though there are many in the per-sample VCFs.
Also, there is no error message at the end of the run:
executor > local (21)
[d3/6be117] process > map_asm (5) [100%] 9 of 9 ✔
[a0/d40a97] process > svim_asm (9) [100%] 9 of 9 ✔
[32/f5dfe6] process > survivor_merge [100%] 1 of 1 ✔
[9c/720983] process > repeatmask_VCF (1) [100%] 1 of 1 ✔
[4f/1f4338] process > tsd_prep (1) [100%] 1 of 1 ✔
[- ] process > tsd_search -
[- ] process > tsd_report -
Completed at: 25-Apr-2024 11:58:28
Duration : 26m 27s
CPU hours : 2.3
Succeeded : 21
Thank you very much for your help!
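For anyone wanting to confirm the "no variants" symptom quickly, a minimal sketch is to count the non-header lines of a VCF. The helper name count_variants is hypothetical, and the sketch assumes an uncompressed VCF in which every line that does not start with '#' is a variant record.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count variant records in an uncompressed VCF.
# Usage: count_variants <file.vcf>
count_variants() {
    # Header lines start with '#'; '^[^#]' matches non-empty lines that do not.
    # grep -c exits with status 1 when the count is 0, so '|| true' keeps
    # the function from reporting failure on an empty (header-only) VCF.
    grep -c '^[^#]' "$1" || true
}
```

Calling it on genotypes_repmasked_filtered.vcf and on one of the per-sample VCFs should make the mismatch described above visible: the filtered file reports 0 while the per-sample files do not.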