Open peterjc opened 5 years ago
Should I file a separate issue on the unclear failure if used with a GFF file without embedded FASTA sequences? Looking at the code it seems to try to give a warning, https://github.com/sanger-pathogens/SnpEffWrapper/blob/v0.2.5/snpEffWrapper/wrapper.py#L297 - but wouldn't it be better to actually abort without calling snpEff build?
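To illustrate the "abort early" idea, here is a minimal sketch of a pre-flight check that could run before snpEff build is called. It is not the wrapper's actual code: the function names has_embedded_fasta and require_embedded_fasta are hypothetical, and the only assumption about the input is the GFF3 convention that embedded sequences follow a ##FASTA directive line.

```python
import sys


def has_embedded_fasta(lines):
    """Return True if a '##FASTA' directive is followed by a '>' header line.

    `lines` is any iterable of text lines, e.g. an open file handle.
    """
    seen_directive = False
    for line in lines:
        stripped = line.strip()
        if not seen_directive:
            # Everything before the directive is annotation; keep scanning.
            if stripped == "##FASTA":
                seen_directive = True
        elif stripped.startswith(">"):
            # At least one FASTA record follows the directive.
            return True
    return False


def require_embedded_fasta(gff_path):
    """Hypothetical guard: exit with a clear message instead of only warning."""
    with open(gff_path) as handle:
        if not has_embedded_fasta(handle):
            sys.exit(
                f"ERROR: {gff_path} has no '##FASTA' section with sequences; "
                "not building the database."
            )
```

Calling require_embedded_fasta("annotation.gff") before the build step would stop with a readable message rather than a CalledProcessError from Java.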
hey peterjc,
Might I ask how you merged your fasta and GFF in the end? I used cat .gff .fasta > .fasta.gff and I am getting the following error (which may be unrelated to how I've combined the files, but just want to check!)
[2018-10-19 15:45:10,601] INFO: Checking that the VCF and GFF contigs are consistent
[2018-10-19 15:45:11,724] INFO: Building snpeff database
Traceback (most recent call last):
  File "/home/manager/miniconda3/envs/ddocent_env/lib/python3.6/site-packages/snpEffWrapper/wrapper.py", line 222, in _snpeff_build_database
    subprocess.check_call(command, stdout=stdout, stderr=stderr)
  File "/home/manager/miniconda3/envs/ddocent_env/lib/python3.6/subprocess.py", line 291, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/usr/bin/java', '-Xmx4g', '-jar', '/media/sf_SharedDrive/Download/snpEff/snpEff.jar', 'build', '-gff3', '-verbose', 'data', '-c', '/home/manager/WGS/WGS analysis/snpeff_data_dir_hrk9sc02/config']' returned non-zero exit status 1.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/manager/miniconda3/envs/ddocent_env/bin/snpEffBuildAndRun", line 45, in
What you are likely missing is the special line ##FASTA (on a line of its own, ending with a newline), which must come before the FASTA records start with ">"...
Luckily I had documented this locally. I did it once by hand and then came up with the following as a reproducible alternative:
bash -c "cat annotation_only.gff; echo '##FASTA' ; cat reference.fasta" > annotation_with_fasta.gff
You could make a dummy file containing just the magic line, and then concatenate the three files (in order) to make your combined file, but I used the echo command here instead.
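For anyone who would rather script this in Python, the same concatenation can be sketched as below. This is only an illustration: the function name merge_gff_and_fasta is made up for this example, and it additionally assumes you want a newline inserted if the GFF file does not end with one, so that ##FASTA always lands on its own line.

```python
import shutil


def merge_gff_and_fasta(gff_path, fasta_path, out_path):
    """Write the GFF annotation, a '##FASTA' line, then the FASTA sequences."""
    with open(out_path, "w") as out:
        last_line = "\n"
        with open(gff_path) as gff:
            for line in gff:
                out.write(line)
                last_line = line
        # Make sure the directive starts on a fresh line.
        if not last_line.endswith("\n"):
            out.write("\n")
        out.write("##FASTA\n")
        with open(fasta_path) as fasta:
            shutil.copyfileobj(fasta, out)
```

With the file names from the command above, this would be called as merge_gff_and_fasta("annotation_only.gff", "reference.fasta", "annotation_with_fasta.gff").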
Hey peter!
Thanks for that, unfortunately the same error message appears after using your cat command to put in that extra line, so I'm back at square one I feel!
Thanks again,
Gordon
I suspect there is something else "wrong" with your GFF file then - I would suggest opening a new issue, and offering to share the files directly with the tool authors (or if you can, posting them online, e.g. via https://gist.github.com).
Yes I will open another issue as now it seems it's unrelated to how the fasta and GFF are merged.
Thanks,
G
The README says "The GFF must contain the reference sequence in Fasta format"
This seems to explain why our first attempt to use SnpEffWrapper failed (snpEff build could not find the FASTA files in the temporary directory). It would be nice to optionally allow passing a FASTA file for the assembly separately from the GFF file.