pauline-ng / SIFT4G_Create_Genomic_DB

Create genomic databases with SIFT predictions. Input is an organism's genomic DNA (.fa) file and the gene annotation file (.gtf). Output will be a database that can be used with SIFT4G_Annotator.jar to annotate VCF files.
GNU General Public License v3.0

* processing database part 1 (size ~0.25 GB): 100.00/100.00% * #80

Closed wanggshuoo closed 1 year ago

wanggshuoo commented 1 year ago

Hi Pauline, I have some questions about the result I got when I make the SIFT4G database. Here is the result of the Homo sapiens example.

converting gene format to use-able input
done converting gene format
making single records file
done making single records template
making noncoding records file
done making noncoding records
make the fasta sequences
done making the fasta sequences
start siftsharp, getting the alignments
/public1/home/sc60080/miniconda2/bin/sift4g -d /public1/home/sc60080/sc60080/ws2/scripts_to_build_SIFT_db/test_files/homo_sapiens_small/gene-annotation-src/Homo_sapiens.GRCh38.pep.all.fa -q /public1/home/sc60080/sc60080/ws2/scripts_to_build_SIFT_db/test_files/homo_sapiens_small/all_prot.fasta --subst /public1/home/sc60080/sc60080/ws2/scripts_to_build_SIFT_db/test_files/homo_sapiens_small/subst --out /public1/home/sc60080/sc60080/ws2/scripts_to_build_SIFT_db/test_files/homo_sapiens_small/SIFT_predictions --sub-results
Checking query data and substitutions files

Searching database for candidate sequences

Is this all right? I haven't seen "All done!", so I doubt whether it finished correctly. And here is the result for my study organism.

done making the fasta sequences
start siftsharp, getting the alignments
/public1/home/sc60080/miniconda2/bin/sift4g -d /public1/home/sc60080/sc60080/ws2/scripts_to_build_SIFT_db/test_files/Rehmannia_chingii/gene-annotation-src/N01_Chr_genome_final_gene.gff3.pep.fa -q /public1/home/sc60080/sc60080/ws2/scripts_to_build_SIFT_db/test_files/Rehmannia_chingii/all_prot.fasta --subst /public1/home/sc60080/sc60080/ws2/scripts_to_build_SIFT_db/test_files/Rehmannia_chingii/subst --out /public1/home/sc60080/sc60080/ws2/scripts_to_build_SIFT_db/test_files/Rehmannia_chingii/SIFT_predictions --sub-results
Checking query data and substitutions files

Searching database for candidate sequences

It seems that I got the same result, but for my study organism it took only about 30 minutes to make the database. That is so short that I can't believe it is really all done.

Any advice? Thanks, wangshuo
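One way to make the question above easier to answer is to capture the script's output to a file and summarize it. The following is only a minimal sketch: the "All done!" marker is the line the poster expected to see (whether the script prints exactly that text is an assumption), and treating any line containing "error" as a problem is a rough heuristic, not something the SIFT4G scripts define.

```python
def summarize_log(text: str) -> dict:
    """Summarize captured build output.

    Assumptions (not confirmed by the SIFT4G scripts):
    - a successful run prints a line containing "All done!"
    - problem lines contain the word "error" in some capitalization
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    return {
        # True only if the expected completion marker appears anywhere
        "finished": any("All done!" in ln for ln in lines),
        # Every non-empty line that mentions "error", case-insensitively
        "errors": [ln for ln in lines if "error" in ln.lower()],
        # The last non-empty line shows where the run stopped
        "last_line": lines[-1] if lines else "",
    }
```

For the output pasted above, the last non-empty line would be "Searching database for candidate sequences" and the completion marker would be absent, which suggests the run stopped at the alignment step rather than finishing.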

pauline-ng commented 1 year ago

If your genome is small, it will finish quickly. I don't see any errors, so this looks right.

wanggshuoo commented 1 year ago

Hi Pauline, There seems to be no error, but there is nothing in my /, for both Homo sapiens and my study organism. Here is my configuration file for the Homo sapiens example. [image: a5ac487492290f88cc9d4da81e394f0]

And here is my configuration file for my study organism. [image: 23b2362d6d36b1edbd834081e49b579]

In the destination folder, SIFT_alignments, singleRecords_with_scores, SIFT_predictions and / are all empty, for both Homo sapiens and my study organism. [image: 605c8b349b56b56757cbc8c3c0ff6be] [image: 9143cade0466022df904420a0318b56]

This happens when I run perl make-SIFT-db-all.pl -config test_files/homo_sapiens-test.txt. I really can't proceed, because I don't know what to do next.

Thank you so much for your help. wangshuo
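The "all empty" observation can be confirmed with a short script instead of screenshots. This is a rough sketch, assuming the three subdirectory names mentioned above sit directly under the output directory set in the config file; the function name and the `parent_dir` argument are hypothetical, not part of the SIFT4G scripts.

```python
from pathlib import Path

# Subdirectory names as mentioned in this thread
SUBDIRS = ("SIFT_alignments", "SIFT_predictions", "singleRecords_with_scores")


def count_outputs(parent_dir, subdirs=SUBDIRS):
    """Count non-empty regular files in each expected output subdirectory.

    Returns a dict mapping each subdirectory name to the number of files
    with size > 0, or None if that subdirectory does not exist.
    """
    counts = {}
    for name in subdirs:
        directory = Path(parent_dir) / name
        if not directory.is_dir():
            counts[name] = None
            continue
        counts[name] = sum(
            1
            for path in directory.iterdir()
            if path.is_file() and path.stat().st_size > 0
        )
    return counts

# Example call (the path is a placeholder for your own output directory):
# print(count_outputs("/path/to/your/output_dir"))
```

A count of 0 for SIFT_predictions would match what is described here: the pipeline reached the sift4g step, but no prediction files were written.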

pauline-ng commented 1 year ago

Can you check your installation of SIFT4G by following these instructions?
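As a first sanity check before going through the installation instructions, it may help to confirm that the sift4g binary the script calls is actually present and executable. The sketch below only tests that; it does not run sift4g or verify that it works, and the helper name `find_sift4g` is made up for illustration.

```python
import os
import shutil


def find_sift4g(explicit_path=None):
    """Return the absolute path of an executable sift4g, or None.

    If explicit_path is given (for example the path shown in the log
    above), that file is checked; otherwise the PATH is searched for a
    program named "sift4g".
    """
    candidate = explicit_path or shutil.which("sift4g")
    if candidate and os.path.isfile(candidate) and os.access(candidate, os.X_OK):
        return os.path.abspath(candidate)
    return None

# Example call, using the path that appears in the log in this thread:
# print(find_sift4g("/public1/home/sc60080/miniconda2/bin/sift4g"))
```

If this returns None, the path configured for sift4g is the first thing to fix. If it returns a path, the binary exists, and the next step would still be the installation test from the instructions.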