Closed SamuelGreenrod closed 1 year ago
UPDATE: MGEfinder still isn't running on my own data but does run on the example data. This suggests there is a problem with my files, although I'm not sure what. I've tried to replicate the example data by: 1) using assemblies built with Spades using the command given in the instructions; 2) using a bam file made following the instructions with bwa mem and formatbam; 3) using a single-contig genome file (in my case one downloaded from NCBI).
Please could you check what could cause the error message so I can change my input files accordingly? Thank you.
I apologize for losing track of this issue. Were you able to resolve this by chance?
@durrantmm @SamuelGreenrod I am experiencing this problem as well, having followed all the steps in the MGEfinder tutorial to prepare my own data. Like SamuelGreenrod, I am able to run the pipeline on the tutorial dataset without issue. Current version of snakemake: 3.13.3. Thanks for any help!
How many different genomes did you use as input? The most likely cause is that it couldn't identify any potential insertion termini.
Only one genome, from evolved strains; I was test running with these files before we look at sequencing from a strain with a known phage. Is the "No termini found in the input file..." error what comes up if the fastas fed in have no novel insertions? Thank you for being at your computer just now :)
Yes, that's correct, it looks like it couldn't find any novel insertions.
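For anyone who wants to check this before the workflow reaches make_database, here is a minimal sketch that counts the data rows in the inferseq files listed in the log. It assumes those files are tab-separated with a single header line, and that files with no rows go along with the "No termini found" message; neither assumption is confirmed by the MGEfinder documentation. The function names `count_data_rows` and `summarize_inferseq` are made up for illustration and are not part of MGEfinder.

```python
import csv
from pathlib import Path


def count_data_rows(tsv_path):
    """Count non-empty rows after the first line of a tab-separated file.

    Assumption: the first line is a header. This is a guess about the
    inferseq output format, not something taken from the MGEfinder docs.
    """
    with open(tsv_path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        next(reader, None)  # skip the assumed header row
        return sum(1 for row in reader if any(cell.strip() for cell in row))


def summarize_inferseq(directory):
    """Map each *inferseq*.tsv file name in `directory` to its data-row count."""
    paths = sorted(Path(directory).glob("*inferseq*.tsv"))
    return {path.name: count_data_rows(path) for path in paths}


if __name__ == "__main__":
    # Directory taken from the log in this issue; change it for your own run.
    directory = "workdir/01.mgefinder/Ancestor/48con5"
    counts = summarize_inferseq(directory)
    for name, n_rows in counts.items():
        print(f"{name}\t{n_rows}")
    if not counts or all(n == 0 for n in counts.values()):
        print("No inferred sequences found; make_database will probably have nothing to build.")
```

If every file reports 0 rows, the sample most likely has no candidate insertions relative to the reference, which matches the explanation above rather than a filesystem latency problem.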
Perfect! Thank you so much.
I'm running MGEfinder on my own data and have followed the pipeline described in the readme document. This is using a complete assembly called "Ancestor.fna", a sample assembly built with Unicycler named "48con5.fna", and the bam and bam.bai files made using bwa mem followed by mgefinder formatbam. When I run it I get the error message:
Parsing inferseq files
Combining the inferseq files...
Loading file 1/3: workdir/01.mgefinder/Ancestor/48con5/03.inferseq_assembly.48con5.Ancestor.tsv
Loading file 2/3: workdir/01.mgefinder/Ancestor/48con5/03.inferseq_reference.48con5.Ancestor.tsv
Loading file 3/3: workdir/01.mgefinder/Ancestor/48con5/03.inferseq_overlap.48con5.Ancestor.tsv
Deleting old database directory...
No termini found in the input file...
Waiting at most 5 seconds for missing files.
Error in job make_database while creating output files workdir/02.database/Ancestor/Ancestor.database.fna, workdir/02.database/Ancestor/Ancestor.database.fna.1.bt2.
MissingOutputException in line 192 of /users/steg500/.conda/envs/mgefinder/lib/python3.6/site-packages/mgefinder/workflow/denovo.original.Snakefile:
Missing files after 5 seconds:
workdir/02.database/Ancestor/Ancestor.database.fna
workdir/02.database/Ancestor/Ancestor.database.fna.1.bt2
This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.
Will exit after finishing currently running jobs.
Exiting because a job execution failed. Look above for error message
I've tried increasing the latency wait time but it doesn't recognise the --latency-wait command when I run it with mgefinder workflow denovo. Do you have any ideas how I could fix this? Thank you!