Open v-mukhina opened 8 months ago
Hi Vera,
Thanks for reaching out, and sorry for the delayed response. This error means the input bam file is not present when ViFi attempts to process it. This can happen if one of the filtering steps that run before the ViFi step fails, which leaves ViFi with an empty input file and produces the error. Could you please share all the non-empty intermediate fasta/fastq and bam files created by the command? I'm trying to figure out which step is causing the problem.
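One way to find which intermediate files are empty is to scan the output directory by file size. The following is a minimal sketch, not part of FastViFi: the function name `split_by_size` and the list of file extensions are assumptions made for illustration, and an "empty" file here means a zero-byte file.

```python
from pathlib import Path

# Assumed extensions for intermediate files; adjust to what the run actually wrote.
INTERMEDIATE_SUFFIXES = (".fasta", ".fas", ".fa", ".fastq", ".fq", ".bam")


def split_by_size(output_dir):
    """Return (non_empty, empty) lists of intermediate files under output_dir."""
    non_empty, empty = [], []
    for path in sorted(Path(output_dir).rglob("*")):
        if path.is_file() and path.suffix in INTERMEDIATE_SUFFIXES:
            # A zero-byte file is treated as empty.
            (non_empty if path.stat().st_size > 0 else empty).append(path)
    return non_empty, empty
```

The first empty file in pipeline order is usually the step worth inspecting. Compressed files such as `.fastq.gz` are not matched by this sketch.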
Best, Sara
Unfortunately I deleted all related files already and switched to another software. However, it looks like the issue is not the bam file itself but the reference one. bwa_idx_load_from_disk error usually pops up when the reference fasta file is not indexed by bwa index. I believe ViFi crashes on the very first bwa command (bwa mem?) that requires those index files for the reference and then all following bam files are empty or just absent.
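To check whether a reference has been indexed, one can look for the five files that `bwa index` writes next to the fasta file (`.amb`, `.ann`, `.bwt`, `.pac`, `.sa`). This is a minimal sketch under that assumption; the function name `missing_bwa_index_files` is hypothetical and not part of ViFi or bwa.

```python
from pathlib import Path

# Suffixes that `bwa index ref.fa` appends to the reference file name.
BWA_INDEX_SUFFIXES = (".amb", ".ann", ".bwt", ".pac", ".sa")


def missing_bwa_index_files(fasta_path):
    """Return the names of bwa index files that are absent for fasta_path."""
    fasta = Path(fasta_path)
    return [
        fasta.name + suffix
        for suffix in BWA_INDEX_SUFFIXES
        if not Path(str(fasta) + suffix).exists()
    ]
```

An empty list means all five index files are present. A non-empty list is consistent with the `bwa_idx_load_from_disk` failure described above.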
Oh wait, I found them! All bam files are empty but the fastq files are not: Archive.zip
Hi Vera,
Sorry to hear about your troubles with FastViFi. Thanks for sharing the output files. As you mentioned, it looks like the kraken step works well and the ViFi step fails. I could not replicate this problem; it works correctly on my end using your exact command. I hear your point about index files for the reference fasta files and it sounds valid. But it looks like the problem persists after you indexed the GRCh38 reference file in the data_repo directory.
ViFi uses the viral_data/hpv/grch38_hpv.fas file to map the input fastq files to the reference human and viral genomes. Do you have this file present in the downloaded viral_data directory? If so, could you please try indexing this fasta file as well and trying again?
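Indexing the combined reference by hand could look like the sketch below. It assumes `bwa` is installed on the host and on `PATH`, which may not match the bwa version inside the container. The helper names `bwa_index_command` and `index_reference` are hypothetical; only the `bwa index <fasta>` command itself comes from bwa.

```python
import shutil
import subprocess


def bwa_index_command(fasta_path):
    """Build the argument list for indexing one reference fasta with bwa."""
    return ["bwa", "index", str(fasta_path)]


def index_reference(fasta_path):
    """Run `bwa index` on fasta_path; raise if bwa is missing or the run fails."""
    if shutil.which("bwa") is None:
        raise RuntimeError("bwa was not found on PATH")
    subprocess.run(bwa_index_command(fasta_path), check=True)
```

For example, `index_reference("viral_data/hpv/grch38_hpv.fas")` would write the index files next to the fasta file, provided that file exists.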
Also, could you please share the version of singularity you are using? I am successfully running tests with singularity version 3.8.6.
Your point about the HG19 reference being hard-coded is a good catch, but that code is not called for viral read detection.
Best, Sara
This is what I have in the viral_data/hpv folder (viral_data.tar.gz was downloaded from the ViFi repository as suggested in the readme):
I've indexed hpv.unaligned.fas on my own to ensure this was not the reason for my issue.
Hi Vera,
I believe I understand the source of the problem. There should be a grch38_hpv.fas file and corresponding index files in the viral_data/hpv folder. This file is automatically created using these two lines in setup_linux_mac.sh in the ViFi repo. I suggest running the whole setup_linux_mac.sh script. Moreover, as you have already downloaded the data_repo and viral_data directories, please make sure to copy/move them to where the setup_linux_mac.sh script is (in the ViFi directory) before running it, so that it does not download the two directories again. The script creates human-viral reference files for three viruses: HPV, HBV and HCV. If you are interested only in HPV, feel free to edit this line in the script to run only for hpv.
You should have a grch38_hpv.fas file and corresponding index files in the viral_data/hpv directory after running this command. If you cannot see these files after running the setup script, please let me know.
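Before running the setup script, it may help to confirm that the two downloaded directories sit next to it, so the script does not download them again. This is a minimal sketch under that assumption; the function name `missing_setup_inputs` is hypothetical, and the three names it checks are the ones mentioned in this thread.

```python
from pathlib import Path


def missing_setup_inputs(vifi_dir):
    """Return which of the script and data directories are absent in vifi_dir."""
    root = Path(vifi_dir)
    # Each expected name is paired with the check it must pass.
    expected = {
        "setup_linux_mac.sh": Path.is_file,
        "data_repo": Path.is_dir,
        "viral_data": Path.is_dir,
    }
    return [name for name, check in expected.items() if not check(root / name)]
```

An empty result means the script and both directories are in place.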
Hi Sara, could you please help me? My issue is probably related to this one (https://github.com/sara-javadzadeh/FastViFi/issues/10). I'm using the following singularity command to run FastViFi on test files:
Right after kraken finishes I face a bwa-related error
All subsequent bam files are also empty.
I have data_repo and viral_data downloaded from Google Drive using the link from the readme, and there are no files that look like bwa index files. The error does not disappear after indexing the hg38 and hg19 fasta files in data_repo. How do I fix this error?
Btw, it appears that the hg19 value is hardcoded here: https://github.com/sara-javadzadeh/ViFi/blob/b1a649685af0620a1d16a8940bb3e21db0fa17b5/scripts/cluster.sh#L10C1-L17C1 (I am not sure if this script is used anywhere).
Best, Vera