rebecca-sh closed this 1 year ago
Hi,
Can you list the files in your Super-Scaffold_16 directory? (Run ls and show the filenames.)
Do your VCF file chromosome names match the filenames in the Super-Scaffold_16 directory?
Hello,
Thanks for your quick response! Here are all the files in the directory: superscaffold16dir.txt
Yes, all chromosome names match. Thanks for checking this!
Can you attach or show me what's in Super-Scaffold_16/CHECK_GENES.LOG?
cat CHECK_GENES.LOG
Chr                Genes with SIFT Scores  Pos with SIFT scores  Pos with Confident Scores
Super-Scaffold_16  100 (93/93)             100 (323194/323194)   74(238145/323194)
ALL                100 (93/93)             100 (323194/323194)   74(238145/323194)
Thanks, your database is built correctly.
Can you show me the first few lines of your VCF file?
cat <vcf_file> | grep -v ^# | head -5
cat dm_superscaffold16.vcf | grep -v ^# | head -5
Super-Scaffold_16 187620 . C T 100 . MUSNIGT00018393
Super-Scaffold_16 29864325 . G A 100 . MUSNIGT00001300
Super-Scaffold_16 4963050 . C A 100 . MUSNIGT00018314
Super-Scaffold_16 32415183 . G A 100 . MUSNIGT00001351
Super-Scaffold_16 32390272 . A T 100 . MUSNIGT00001351
Hi Rebecca,
The 8th column "INFO" in the VCF file is required. Please add that column.
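The VCF format defines eight fixed, tab-separated columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO). A minimal Python sketch for checking that every data line carries all eight might look like the following. The function names are hypothetical and not part of SIFT4G, and the check only counts columns; it does not validate their contents.

```python
# Fixed VCF columns, in order.
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def check_vcf_line(line):
    """Return a list of problems found in one VCF data line (empty if none)."""
    # VCF fields are tab-separated, so a space-separated line counts as one field.
    fields = line.rstrip("\n").split("\t")
    problems = []
    if len(fields) < len(VCF_COLUMNS):
        missing = VCF_COLUMNS[len(fields):]
        problems.append("missing column(s): " + ", ".join(missing))
    return problems


def check_vcf_text(text):
    """Check every non-header line; return {line_number: [problems]}."""
    report = {}
    for number, line in enumerate(text.splitlines(), start=1):
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/meta lines
        problems = check_vcf_line(line)
        if problems:
            report[number] = problems
    return report
```

Calling `check_vcf_text(open("dm_superscaffold16.vcf").read())` would return an empty dictionary when every data line has at least eight tab-separated fields. Because the split is on tabs only, a line whose fields are separated by spaces is also reported as missing columns.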
Thanks for your feedback Pauline! I'm using a different ferret genome to try to annotate my variants, but it still doesn't seem to be working: I still get an error that the .regions file could not be found.
I have tried annotating my variants with the already established ferret database by aligning the coordinates, and this seems to work. I think it could be an issue with the install, because when I look more closely at some of my runs, this is the error I am getting:
/opt/scripts_to_build_SIFT_db/make_regions_file.py: line 1: import: command not found
/opt/scripts_to_build_SIFT_db/make_regions_file.py: line 2: import: command not found
/opt/scripts_to_build_SIFT_db/make_regions_file.py: line 3: import: command not found
/opt/scripts_to_build_SIFT_db/make_regions_file.py: line 4: import: command not found
/opt/scripts_to_build_SIFT_db/make_regions_file.py: line 6: syntax error near unexpected token `('
/opt/scripts_to_build_SIFT_db/make_regions_file.py: line 6: `def get_pos (line):'
/usr/bin/env: python3: No such file or directory
/usr/bin/env: python3: No such file or directory
/usr/bin/env: python3: No such file or directory
/usr/bin/env: python3: No such file or directory
/usr/bin/env: python3: No such file or directory
/usr/bin/env: python3: No such file or directory
I will check how the software was compiled.
Thanks again
You need python3 installed. Once you have python3, rerun the database generation; you should then have .regions files in the folder.
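As a rough way to confirm both points, the Python sketch below checks whether a `python3` executable is visible on the PATH (which is what `/usr/bin/env python3` relies on) and lists any `.regions` files in a database folder. The function names are hypothetical, and the script has to be run in the same environment as the build scripts, using whatever Python 3 interpreter can be invoked there by explicit path.

```python
import shutil
from pathlib import Path


def find_python3():
    """Return the full path of python3 on PATH, or None if it is not found."""
    return shutil.which("python3")


def list_regions_files(db_dir):
    """Return the sorted names of all *.regions files directly inside db_dir."""
    return sorted(p.name for p in Path(db_dir).glob("*.regions"))


if __name__ == "__main__":
    print("python3 on PATH:", find_python3())
    # "Super-Scaffold_16" is the database folder named in this thread;
    # replace it with the path to your own folder.
    print(".regions files:", list_regions_files("Super-Scaffold_16"))
```

If `find_python3()` returns `None`, scripts whose first line is `#!/usr/bin/env python3` will fail with the "No such file or directory" error. An empty list from `list_regions_files` after a rebuild suggests the regions step still did not run.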
Hello,
I've had some issues generating a database for my species. I have managed to get sift4g to run and it appears to be working. For example, this is the output from one of my runs: superscaffold16.txt
However, when I try to run SIFT4G Annotator, I've realised that no .regions files are generated during my runs, no files populate my SIFT_alignments directory, and there are no files in my singleRecords_with_scores directory at the end of the run. Here is what the PARENT_DIR for one of my runs looks like:
shawr@EI-HPC interactive Super-Scaffold_16_PARENTDIR]$ ls -lthr
total 1.1M
drwxrwx--- 2 shawr EI_ga011    0 Mar 25 11:19 SIFT_alignments
drwxrwx--- 2 shawr EI_ga011    0 Mar 25 11:19 dbSNP
-rwxrwx--- 1 shawr EI_ga011    0 Mar 25 11:20 invalid.log
-rwxrwx--- 1 shawr EI_ga011    0 Mar 25 11:20 Log2.txt
drwxrwx--- 2 shawr EI_ga011 6.0K Mar 25 11:20 subst
drwxrwx--- 2 shawr EI_ga011 4.9K Mar 25 11:20 fasta
-rwxrwx--- 1 shawr EI_ga011  49K Mar 25 11:20 peptide.log
-rwxrwx--- 1 shawr EI_ga011   71 Mar 25 11:20 fasta.log
-rwxrwx--- 1 shawr EI_ga011  51K Mar 25 11:20 all_prot.fasta
drwxrwx--- 2 shawr EI_ga011  12K Mar 25 12:09 SIFT_predictions
drwxrwx--- 2 shawr EI_ga011  291 Mar 25 12:09 singleRecords
drwxrwx--- 2 shawr EI_ga011  110 Mar 25 12:10 Super-Scaffold_16
drwxrwx--- 2 shawr EI_ga011    0 Mar 25 12:10 singleRecords_with_scores
drwxrwx--- 2 shawr EI_ga011   77 Mar 25 12:12 chr-src
drwxrwx--- 2 shawr EI_ga011  112 Mar 25 12:16 gene-annotation-src
I have also removed any protein sequences that contained unwanted characters ('X', '*', '-').
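For reference, that kind of filtering could be sketched in Python roughly as follows. This is an illustration with hypothetical function names, not the script actually used here; it drops a whole record whenever its sequence contains 'X', '*' or '-'.

```python
# Characters that disqualify a protein sequence.
UNWANTED = set("X*-")


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def drop_unwanted(records, unwanted=UNWANTED):
    """Keep only records whose sequence has none of the unwanted characters."""
    return [(h, s) for h, s in records if not (set(s.upper()) & unwanted)]
```

For example, `drop_unwanted(parse_fasta(open("all_prot.fasta").read()))` would return the list of records that pass the filter. Note that a sequence ending in a stop symbol '*' is removed entirely rather than trimmed.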
Any help would be really appreciated! Many thanks,
Becky