pauline-ng / SIFT4G_Create_Genomic_DB

Create genomic databases with SIFT predictions. Input is an organism's genomic DNA (.fa) file and the gene annotation file (.gtf). Output will be a database that can be used with SIFT4G_Annotator.jar to annotate VCF files.
GNU General Public License v3.0

Creating local database is always interrupted at aligning step #96

Open clavedec opened 5 months ago

clavedec commented 5 months ago

Dear Pauline,

I have been trying to use make-SIFT-db-all.pl to create a database. Everything seems to go well at first, and files are created in the singleRecords, fasta and subst directories (the others are empty). However, I then get an email saying the Slurm job has failed with 'Exit code 255', usually after 11-12 hours of running, at the "Aligning queries with candidate sequences" step. The last time, it got as far as:

Aligning queries with candidate sequences ... processing database part 1 (size ~1.00 GB): 47.50/100.00%

It doesn't seem to be a memory problem, as the jobs are using less memory than I requested. Any suggestions as to what could be happening?

Below is the config file I'm using.

Thank you very much for your help!

Best wishes, Clarissa

--

GENETIC_CODE_TABLE=1
GENETIC_CODE_TABLENAME=Standard
MITO_GENETIC_CODE_TABLE=2
MITO_GENETIC_CODE_TABLENAME=Vertebrate Mitochondrial

PARENT_DIR=/n/holyscratch01/edwards_lab/cfcarvalho/Chiroxiphia/Antilophia/07.Non-synonymous/scripts_to_build_SIFT_db
ORG=chiroxiphia lanceolata
ORG_VERSION=bChiLan1
DBSNP_VCF_FILE=

SIFT4G_PATH=~/sift4g/bin/sift4g

PROTEIN_DB=/n/holyscratch01/edwards_lab/cfcarvalho/Chiroxiphia/Antilophia/07.Non-synonymous/scripts_to_build_SIFT_db/GCF_009829145.1/protein.faa

GENE_DOWNLOAD_DEST=gene-annotation-src
CHR_DOWNLOAD_DEST=chr-src
LOGFILE=Log.txt
ZLOGFILE=Log2.txt
FASTA_DIR=fasta
SUBST_DIR=subst
ALIGN_DIR=SIFT_alignments
SIFT_SCORE_DIR=SIFT_predictions
SINGLE_REC_BY_CHR_DIR=singleRecords
SINGLE_REC_WITH_SIFTSCORE_DIR=singleRecords_with_scores
DBSNP_DIR=dbSNP

FASTA_LOG=fasta.log
INVALID_LOG=invalid.log
PEPTIDE_LOG=peptide.log
ENS_PATTERN=ENS
SINGLE_RECORD_PATTERN=:change:_aa1valid_dbsnp.singleRecord
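
Before a long run, it can help to confirm that the paths in a config like the one above actually resolve on the compute node. The following is only a minimal Python sketch, not part of the SIFT4G scripts: the function names `parse_config` and `check_paths` are hypothetical, and it assumes the config file holds one KEY=VALUE pair per line. It would not by itself explain an 'Exit code 255' from sift4g; it only rules out missing files.

```python
from pathlib import Path


def parse_config(text):
    """Parse KEY=VALUE lines into a dict (assumes one pair per line)."""
    config = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, '#' comments, and anything without '='.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        config[key.strip()] = value.strip()
    return config


def check_paths(config, path_keys=("PARENT_DIR", "PROTEIN_DB", "SIFT4G_PATH")):
    """Return warnings for path-like keys whose target does not exist.

    The choice of keys is an assumption based on the config shown in
    this thread; keys that are absent or empty are skipped.
    """
    warnings = []
    for key in path_keys:
        value = config.get(key, "")
        if not value:
            continue
        # expanduser() resolves a leading '~', as in ~/sift4g/bin/sift4g.
        if not Path(value).expanduser().exists():
            warnings.append(f"{key} path does not exist: {value}")
    return warnings


if __name__ == "__main__":
    import sys

    # Usage: python check_config.py <config file>
    cfg = parse_config(Path(sys.argv[1]).read_text())
    for message in check_paths(cfg):
        print(message)
```

Running it inside the same Slurm environment as the real job is more informative than running it on a login node, since scratch filesystems are not always mounted identically.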

pauline-ng commented 4 months ago

Hi, The error is coming from sift4g -- can you resubmit your issue to https://github.com/rvaser/sift4g/issues

Also, were you able to run the test example?

Thanks, Pauline

clavedec commented 4 months ago

Hi Pauline,

Thank you for your reply. I have submitted the issue on the SIFT4G GitHub page (https://github.com/rvaser/sift4g/issues/37). I found some issues related to mine, but the solutions found there don't seem to apply to my case.

I ran the test example successfully, without any problem.

Thanks again.

Best, Clarissa

noobylf commented 3 months ago

Hello Clarissa and Pauline, I ran into the same problem. I ran the example file without any problem, but when I built my own database with sift4g, it stopped working after 20 hours at the following step:

Searching database for candidate sequences

I don't know what to do. Do you have any solutions or suggestions? Thank you very much for your help.

Best, Kwame