pauline-ng / SIFT4G_Create_Genomic_DB

Create genomic databases with SIFT predictions. Input is an organism's genomic DNA (.fa) file and the gene annotation file (.gtf). Output will be a database that can be used with SIFT4G_Annotator.jar to annotate VCF files.
GNU General Public License v3.0

Error when constructing test database (homo sapiens) #29

Closed ruru-adra closed 3 years ago

ruru-adra commented 3 years ago

Hi Pauline

I am still failing to build my SIFT4G database, so I tried to build the homo_sapiens_small sample database instead. It returned an error during "making single records file":

Use of uninitialized value $fasta_subseq in concatenation (.) or string at make-single-records-BIOPERL.pl line 210, <IN_TX> line 2.
Use of uninitialized value $fasta_subseq in concatenation (.) or string at make-single-records-BIOPERL.pl line 210, <IN_TX> line 3.

The SIFT_alignments and SIFT_predictions folders and singleRecords_with_scores are empty. all_prot.fasta, fasta.log, invalid.log and peptide.log were generated.

I then ran "../../../sift4g/bin/sift4g -d ../protdb/uniprot.fasta -q all_prot.fasta --out SIFT_predictions/", which returned no error and generated files in the SIFT_predictions folder.

Then I tested SIFT4G itself with "./bin/sift4g -q ./test_files/query.fasta --subst ./test_files/ -d ./test_files/sample_protein_database.fa". The SIFT4G program works well, with no errors on the sample data.

My gcc version is 9.3.0.

Thank you

pauline-ng commented 3 years ago

If "./bin/sift4g" works and "../../../sift4g/bin/sift4g" doesn't, then the SIFT4G path set in the config file is incorrect. Please write the full path to SIFT4G (not a relative path) so that it can be run from any directory.
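
The advice above (use a full path so the binary can be found from any directory) can be checked before starting a build. The following is a minimal sketch in Python, not part of the SIFT4G scripts: `resolve_executable` is a hypothetical helper, and the path passed to it is assumed to be whatever string is written in the config file.

```python
import os


def resolve_executable(path_from_config, base_dir=None):
    """Return the absolute path of an executable file, or None.

    A relative path is resolved against base_dir (default: the current
    working directory), which is why the same relative path can work
    from one directory and fail from another.
    """
    base = base_dir or os.getcwd()
    expanded = os.path.expanduser(path_from_config)
    candidate = os.path.abspath(os.path.join(base, expanded))
    if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
        return candidate
    return None


if __name__ == "__main__":
    # Path taken from this thread; replace with the value in your config.
    found = resolve_executable("../../../sift4g/bin/sift4g")
    print(found or "not found from this directory; use a full path")
```

If the helper prints a full path, that path is the value to write into the config file.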

The following are warnings that can be ignored:

Use of uninitialized value $fasta_subseq in concatenation (.) or string at make-single-records-BIOPERL.pl line 210, <IN_TX> line 2.
Use of uninitialized value $fasta_subseq in concatenation (.) or string at make-single-records-BIOPERL.pl line 210, <IN_TX> line 3.

ruru-adra commented 3 years ago

Pauline,

I deleted the files, fixed the path and ran "make-SIFT-db-all.pl" again using the sample database. It looks good and the previous warning is gone. The SIFT_predictions folder is not empty, but singleRecords_with_scores and SIFT_alignments are empty.

However, I got this:

Can't exec "python": No such file or directory at make-sift-scores-db-batch.pl line 66.
Can't exec "python": No such file or directory at make-sift-scores-db-batch.pl line 66.
checking the databases
Can't exec "python": No such file or directory at make-SIFT-db-all.pl line 137.
Can't exec "python": No such file or directory at make-SIFT-db-all.pl line 137.
zipping up ./test_files/homo_sapiens_small/chr-src/*
All done!

Is this an error or a warning? Does it mean the sample database was built successfully? The invalid.log also contains sequences. By the way, what does invalid.log refer to?

Thank you.

pauline-ng commented 3 years ago

Do you have python installed on your computer? Please install python and rerun.

ruru-adra commented 3 years ago

Yes, python (both python2 and python3) is installed on my computer.
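
The `Can't exec "python"` message reports that no command with the exact name `python` could be launched, which can happen even when `python2` and `python3` are installed. Whether that was the cause on this machine is an assumption. A small sketch to see which command names are on the PATH (the helper name `find_interpreters` is hypothetical):

```python
import shutil


def find_interpreters(names=("python", "python2", "python3")):
    """Map each command name to its full path on PATH, or None if absent."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_interpreters().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If `python` is reported as not found while `python3` is present, the Perl scripts that call `python` will fail in the way shown above.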

ruru-adra commented 3 years ago

Hi Pauline,

I updated my python version and reran the command, but I got this error:

done getting all the scores
populating databases
File "make_regions_file.py", line 61
    print 'check_SIFTDB.py '
    ^
SyntaxError: Missing parentheses in call to 'print'. Did you mean print('check_SIFTDB.py ')?
File "make_regions_file.py", line 61
    print 'check_SIFTDB.py '
    ^
SyntaxError: Missing parentheses in call to 'print'. Did you mean print('check_SIFTDB.py ')?
checking the databases
File "check_SIFTDB.py", line 44
    if predicted_on > 0:
    ^
TabError: inconsistent use of tabs and spaces in indentation
File "check_SIFTDB.py", line 44
    if predicted_on > 0:
    ^
TabError: inconsistent use of tabs and spaces in indentation
zipping up /home/bioinfo/bioinfo_sware/scripts_to_build_SIFT_db/test_files/homo_sapiens_small/chr-src/*
All done!

While the database was being populated, there were some files in the singleRecords_with_scores folder. But after this error, the singleRecords_with_scores folder was empty.
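
Both errors above are raised when a script is compiled, before any of it runs: a `print` statement without parentheses is Python 2 syntax, and Python 3 rejects indentation that mixes tabs and spaces inconsistently. A minimal sketch for checking a script's source against the interpreter in use (`check_python3_syntax` is a hypothetical helper; the sample strings are illustrative, not the real contents of the repository scripts):

```python
def check_python3_syntax(source, filename="<script>"):
    """Return None if source compiles under this interpreter, else a summary.

    TabError and IndentationError are subclasses of SyntaxError, so a
    single except clause covers both kinds of failure seen in the thread.
    """
    try:
        compile(source, filename, "exec")
    except SyntaxError as err:
        return f"{filename}:{err.lineno}: {type(err).__name__}: {err.msg}"
    return None


if __name__ == "__main__":
    samples = {
        "py2_print.py": "print 'check_SIFTDB.py '\n",
        "mixed_indent.py": "if True:\n\tx = 1\n        y = 2\n",
        "ok.py": "print('ok')\n",
    }
    for name, text in samples.items():
        print(check_python3_syntax(text, name) or f"{name}: compiles")
```

To check a real file, read its text and pass it in, for example `check_python3_syntax(open(path).read(), path)`.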

pauline-ng commented 3 years ago

Please clone the latest repo. I updated the python scripts to be python3 compatible 1 week ago.

ruru-adra commented 3 years ago

Hi Pauline,

Done, and the test database was built successfully. I will try with my own datasets.

Thank you