pauline-ng / SIFT4G_Create_Genomic_DB

Create genomic databases with SIFT predictions. Input is an organism's genomic DNA (.fa) file and the gene annotation file (.gtf). Output will be a database that can be used with SIFT4G_Annotator.jar to annotate VCF files.
GNU General Public License v3.0

Making database for Culex Tarsalis ends after processing queries with no warning or error #83

Open Afei99357 opened 10 months ago

Afei99357 commented 10 months ago

Hello, thank you for helping me with the previous issue. After addressing that issue, processing continues to the point shown below and then stops for no apparent reason. I checked the database folder and there are no files in it at all. Could you please help me with this issue?

Here is the run output:

[Screenshot 2023-09-01 at 11:57:18 AM]

####### my config file ########
GENE_DOWNLOAD_SITE=ftp://ftp.ensemblgenomes.org/pub/bacteria/release-34/gtf//bacteria_11_collection/candidatus_carsonella_ruddii_pv/Candidatus_carsonella_ruddii_pv.ASM1036v1.34.gtf.gz
PEP_FILE=ftp://ftp.ensemblgenomes.org/pub/bacteria/release-34/fasta//bacteria_11_collection/candidatus_carsonella_ruddii_pv/pep/Candidatus_carsonella_ruddii_pv.ASM1036v1.34.pep.all.fa.gz
CHR_DOWNLOAD_SITE=ftp://ftp.ensemblgenomes.org/pub/bacteria/release-34/fasta//bacteria_11_collection/candidatus_carsonella_ruddii_pv/dna/

GENETIC_CODE_TABLE=1
GENETIC_CODE_TABLENAME=Standard
MITO_GENETIC_CODE_TABLE=1
MITO_GENETIC_CODE_TABLENAME=Standard

PARENT_DIR=/sift4g/run_sift
ORG=Culex Tarsalis
ORG_VERSION=v1.0.a1

# Running SIFT 4G
SIFT4G_PATH=/sift4g/bin/sift4g
PROTEIN_DB=/landscape_genetics/SIFT/uniref90.fasta

# Sub-directories, don't need to change
LOGFILE=Log.txt
ZLOGFILE=Log2.txt
GENE_DOWNLOAD_DEST=gene-annotation-src
CHR_DOWNLOAD_DEST=chr-src
FASTA_DIR=fasta
SUBST_DIR=subst
SIFT_SCORE_DIR=SIFT_predictions
SINGLE_REC_BY_CHR_DIR=singleRecords/
SINGLE_REC_WITH_SIFTSCORE_DIR=singleRecords_with_scores
DBSNP_DIR=dbSNP

# Doesn't need to change
FASTA_LOG=fasta.log
INVALID_LOG=invalid.log
PEPTIDE_LOG=peptide.log
ENS_PATTERN=ENS
SINGLE_RECORD_PATTERN=:change:_aa1valid_dbsnp.singleRecord
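
Since the config is a plain KEY=VALUE file, one way to sanity-check it before a long run is to read out the path-valued keys and test whether they exist. The following is only a minimal sketch: `config_value` is a hypothetical helper, not something shipped with SIFT4G_Create_Genomic_DB, and the demo writes a throwaway file instead of touching a real config.

```shell
#!/bin/sh
# config_value is a hypothetical helper, not part of SIFT4G_Create_Genomic_DB.
# It prints the value of KEY from a KEY=VALUE file without sourcing the file,
# because a value such as "Culex Tarsalis" contains a space.
config_value() {
    key=$1
    file=$2
    sed -n "s/^${key}=//p" "$file" | head -n 1
}

# Throwaway config for the demo; point config_value at the real file instead.
demo_cfg=$(mktemp)
cat > "$demo_cfg" <<'EOF'
PARENT_DIR=/sift4g/run_sift
ORG=Culex Tarsalis
SIFT4G_PATH=/sift4g/bin/sift4g
PROTEIN_DB=/landscape_genetics/SIFT/uniref90.fasta
EOF

# Report whether each path-valued key points at something that exists.
for key in PARENT_DIR SIFT4G_PATH PROTEIN_DB; do
    value=$(config_value "$key" "$demo_cfg")
    if [ -e "$value" ]; then
        echo "found    $key=$value"
    else
        echo "missing  $key=$value"
    fi
done

org=$(config_value ORG "$demo_cfg")
echo "ORG is: $org"
rm -f "$demo_cfg"
```

The file is parsed with `sed` rather than sourced, so values with spaces are returned intact and nothing in the config is executed.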

pauline-ng commented 9 months ago

Can you run the test file?

Afei99357 commented 7 months ago

I tried your Docker installation and ran it on the test file. Everything looks okay until the same issue as above occurs: it suddenly crashes and stops here:

I can see it doing something like: Searching database for candidate sequences

processing database part 1 (size ~0.25 GB): 100.00/100.00% *

I remember it finishing at least part 220, so it must have crashed while processing one of the later parts of the database.

Here is the terminal screen. For some reason it does not show the database part being processed when it crashed.

I have no name!@bc8b48e0e31e:/Users/ericliao/Desktop/WNV_project_files/sift_docker/SIFT4G_Create_Genomic_DB$ perl make-SIFT-db-all.pl -config ./test_files/homo_sapiens-test.txt
entered mkdir /Users/ericliao/Desktop/WNV_project_files/sift_docker/SIFT4G_Create_Genomic_DB/test_files/homo_sapiens_small/GRCh38.83
converting gene format to use-able input
done converting gene format
making single records file
done making single records template
making noncoding records file
done making noncoding records
make the fasta sequences
done making the fasta sequences
start siftsharp, getting the alignments
/sift4g/bin/sift4g -d /Users/ericliao/Desktop/WNV_project_files/sift_docker/bigdrive/SIFT_databases/uniref90.fasta -q /Users/ericliao/Desktop/WNV_project_files/sift_docker/SIFT4G_Create_Genomic_DB/test_files/homo_sapiens_small/all_prot.fasta --subst /Users/ericliao/Desktop/WNV_project_files/sift_docker/SIFT4G_Create_Genomic_DB/test_files/homo_sapiens_small/subst --out /Users/ericliao/Desktop/WNV_project_files/sift_docker/SIFT4G_Create_Genomic_DB/test_files/homo_sapiens_small/SIFT_predictions --sub-results
Checking query data and substitutions files

Here is the configuration for the homo_sapiens_small test file:

GENETIC_CODE_TABLE=1
GENETIC_CODE_TABLENAME=Standard
MITO_GENETIC_CODE_TABLE=2
MITO_GENETIC_CODE_TABLENAME=Vertebrate Mitochondrial

PARENT_DIR=/Users/ericliao/Desktop/WNV_project_files/sift_docker/SIFT4G_Create_Genomic_DB/test_files/homo_sapiens_small
ORG=homo_sapiens
ORG_VERSION=GRCh38.83
DBSNP_VCF_FILE=Homo_sapiens.vcf.gz

# Running SIFT4G, this path works for the Dockerfile
SIFT4G_PATH=/sift4g/bin/sift4g

# PROTEIN_DB needs to be uncompressed
PROTEIN_DB=/Users/ericliao/Desktop/WNV_project_files/sift_docker/bigdrive/SIFT_databases/uniref90.fasta

# Sub-directories, don't need to change
GENE_DOWNLOAD_DEST=gene-annotation-src
CHR_DOWNLOAD_DEST=chr-src
LOGFILE=Log.txt
ZLOGFILE=Log2.txt
FASTA_DIR=fasta
SUBST_DIR=subst
ALIGN_DIR=SIFT_alignments
SIFT_SCORE_DIR=SIFT_predictions
SINGLE_REC_BY_CHR_DIR=singleRecords
SINGLE_REC_WITH_SIFTSCORE_DIR=singleRecords_with_scores
DBSNP_DIR=dbSNP

# Doesn't need to change
FASTA_LOG=fasta.log
INVALID_LOG=invalid.log
PEPTIDE_LOG=peptide.log
ENS_PATTERN=ENS
SINGLE_RECORD_PATTERN=:change:_aa1valid_dbsnp.singleRecord

Do you know what is going on here? Why does it crash at this point every time?

pauline-ng commented 7 months ago

What's your docker command? Did you mount all the volumes?

Also, I tested this on Linux; I've never tested it on Windows.

Afei99357 commented 7 months ago

Thank you for your quick response! I modified the comments above a little, in case you haven't seen that yet. Also, even the first time, I was running it in Docker, because my laptop is a MacBook with an Apple M1 chip.

So today I tried your new way of doing it with your docker command. Here is the command:

docker build --platform linux/amd64 -t sift4g_db .

docker run -it --user $(id -u):$(id -g) -v /Users/ericliao/Desktop/WNV_project_files/landscape_genetics/SIFT/sift_docker:/Users/ericliao/Desktop/WNV_project_files/landscape_genetics/SIFT/sift_docker -v /Users/ericliao/Desktop/WNV_project_files/landscape_genetics/SIFT/sift_docker/bigdrive:/Users/ericliao/Desktop/WNV_project_files/landscape_genetics/SIFT/sift_docker/bigdrive sift4g_db /bin/bash

Afei99357 commented 7 months ago

My initial thought is that the problem is in the make-SIFT-db-all.pl file, lines 122 to 124. Somehow the $sift4g_command does not execute correctly, and the script just exits without any error or warning?

my $sift4g_command = $meta_hash{"SIFT4G_PATH"} . " -d " . $meta_hash{"PROTEIN_DB"} . " -q " . $all_prot_fasta . " --subst " . $meta_hash{"PARENT_DIR"} . "/" . $meta_hash{"SUBST_DIR"} . " --out " . $meta_hash{"PARENT_DIR"} . "/" . $meta_hash{"SIFT_SCORE_DIR"} . " --sub-results " ;

print $sift4g_command . "\n";

$siftsharp_command;

$sift4g_command;

if ($? != 0) { exit (-1); }

Right now, I am running the $sift4g_command directly in my terminal, using sift4g on the results I got from the previous steps. If it stops the same way as before, the error must come from somewhere inside sift4g.
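
When a command is run by hand like this, capturing its exit status right after it ends can show how it stopped even when nothing was printed. Below is a minimal shell sketch under stated assumptions: `describe_status` is a hypothetical helper, and `sh -c 'kill -9 $$'` is only a stand-in for the real sift4g command. It relies on the common shell convention (bash, dash) that a status above 128 means the process was terminated by signal number status minus 128.

```shell
#!/bin/sh
# describe_status is a hypothetical helper, not part of SIFT4G.
# It turns a numeric exit status into a short human-readable description.
describe_status() {
    code=$1
    if [ "$code" -eq 0 ]; then
        echo "exited normally"
    elif [ "$code" -gt 128 ]; then
        echo "killed by signal $((code - 128))"
    else
        echo "failed with exit code $code"
    fi
}

# Stand-in for the real sift4g command: a child shell that sends itself SIGKILL.
rc=0
sh -c 'kill -9 $$' || rc=$?

result=$(describe_status "$rc")
echo "$result"
```

To use it on the real run, replace the stand-in line with the full sift4g command and keep the `|| rc=$?` part so the status is recorded before anything else overwrites it.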

pauline-ng commented 7 months ago

Go into your container and then check that all the paths are there

1) Run the container
docker run -it --user $(id -u):$(id -g) -v /Users/ericliao/Desktop/WNV_project_files/landscape_genetics/SIFT/sift_docker:/Users/ericliao/Desktop/WNV_project_files/landscape_genetics/SIFT/sift_docker -v /Users/ericliao/Desktop/WNV_project_files/landscape_genetics/SIFT/sift_docker/bigdrive:/Users/ericliao/Desktop/WNV_project_files/landscape_genetics/SIFT/sift_docker/bigdrive sift4g_db /bin/bash

2) This is the SIFT4G command. Go through each of its paths and make sure they exist.
/sift4g/bin/sift4g -d /Users/ericliao/Desktop/WNV_project_files/sift_docker/bigdrive/SIFT_databases/uniref90.fasta -q /Users/ericliao/Desktop/WNV_project_files/sift_docker/SIFT4G_Create_Genomic_DB/test_files/homo_sapiens_small/all_prot.fasta --subst /Users/ericliao/Desktop/WNV_project_files/sift_docker/SIFT4G_Create_Genomic_DB/test_files/homo_sapiens_small/subst --out /Users/ericliao/Desktop/WNV_project_files/sift_docker/SIFT4G_Create_Genomic_DB/test_files/homo_sapiens_small/SIFT_predictions --sub-results

Do

/sift4g/bin/sift4g   # does this show it's an executable?

# does ls below show the files?
ls  /Users/ericliao/Desktop/WNV_project_files/sift_docker/bigdrive/SIFT_databases/uniref90.fasta 
ls /Users/ericliao/Desktop/WNV_project_files/sift_docker/SIFT4G_Create_Genomic_DB/test_files/homo_sapiens_small/all_prot.fasta
ls /Users/ericliao/Desktop/WNV_project_files/sift_docker/SIFT4G_Create_Genomic_DB/test_files/homo_sapiens_small/subst 
... etc
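
The per-path checks can also be sketched as one loop, so a missing mount shows up as a single flagged line. This is only an illustration: `check_paths` is a hypothetical helper, and the demo uses a temporary directory rather than the real paths, which would be the values passed to -d, -q, --subst and --out.

```shell
#!/bin/sh
# check_paths is a hypothetical helper, not part of SIFT4G.
# It prints one line per argument and returns non-zero if any path is missing.
check_paths() {
    missing=0
    for p in "$@"; do
        if [ -e "$p" ]; then
            echo "ok       $p"
        else
            echo "MISSING  $p"
            missing=1
        fi
    done
    return "$missing"
}

# Demo with throwaway paths: one file that exists and one directory that does not.
demo_dir=$(mktemp -d)
touch "$demo_dir/all_prot.fasta"
report=$(check_paths "$demo_dir/all_prot.fasta" "$demo_dir/subst") || true
echo "$report"
rm -rf "$demo_dir"
```

Run inside the container, a `MISSING` line would point at a path that was not mounted or was spelled differently in the config.
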
Afei99357 commented 7 months ago

I think the mounting is right so far, because I am able to run sift4g in my Docker container right now. It is still processing.

But these are the results after running your recommendations.

I have no name!@bc8b48e0e31e:/$ /sift4g/bin/sift4g
[ERROR]: missing option -q (query file)
I have no name!@bc8b48e0e31e:/$ ls /Users/ericliao/Desktop/WNV_project_files/sift_docker/bigdrive/SIFT_databases/uniref90.fasta
/Users/ericliao/Desktop/WNV_project_files/sift_docker/bigdrive/SIFT_databases/uniref90.fasta
I have no name!@bc8b48e0e31e:/$ ls /Users/ericliao/Desktop/WNV_project_files/sift_docker/SIFT4G_Create_Genomic_DB/test_files/homo_sapiens_small/all_prot.fasta
/Users/ericliao/Desktop/WNV_project_files/sift_docker/SIFT4G_Create_Genomic_DB/test_files/homo_sapiens_small/all_prot.fasta
I have no name!@bc8b48e0e31e:/$ ls /Users/ericliao/Desktop/WNV_project_files/sift_docker/SIFT4G_Create_Genomic_DB/test_files/homo_sapiens_small/subst
ENST00000269844.subst ENST00000398085.subst ENST00000460212.subst ENST00000485621.subst ENST00000270112.subst ENST00000398132.subst ENST00000460259.subst ENST00000485649.subst ENST00000270139.subst ENST00000398133.subst ENST00000460305.subst ENST00000485790.subst ENST00000270142.subst ENST00000398137.subst ENST00000460328.subst ENST00000485865.subst ENST00000270162.subst ENST00000398158.subst ENST00000460521.subst ENST00000485895.subst ENST00000270172.subst ENST00000398165.subst ENST00000460557.subst ENST00000485933.subst ENST00000270190.subst ENST00000398168.subst ENST00000460679.subst ENST00000486002.subst ENST00000284878.subst ENST00000398208.subst ENST00000460704.subst ENST00000486023.subst ENST00000284881.subst ENST00000398224.subst ENST00000460734.subst ENST00000486098.subst ENST00000284885.subst ENST00000398225.subst ENST00000460740.subst ENST00000486126.subst ENST00000284894.subst ENST00000398227.subst ENST00000460783.subst ENST00000486275.subst ENST00000284971.subst ENST00000398229.subst ENST00000460883.subst ENST00000486303.subst ENST00000284984.subst ENST00000398232.subst ENST00000460886.subst ENST00000486363.subst ENST00000284987.subst ENST00000398234.subst ENST00000460905.subst ENST00000486367.subst ENST00000285667.subst
ENST00000398236.subst ENST00000460932.subst ENST00000486427.subst ENST00000285670.subst ENST00000398341.subst ENST00000460989.subst ENST00000486519.subst ENST00000285679.subst ENST00000398352.subst ENST00000461088.subst ENST00000486520.subst ENST00000285681.subst ENST00000398367.subst ENST00000461123.subst ENST00000486719.subst ENST00000286777.subst ENST00000398397.subst ENST00000461281.subst ENST00000486741.subst ENST00000286788.subst ENST00000398405.subst ENST00000461686.subst ENST00000486746.subst ENST00000286791.subst ENST00000398431.subst ENST00000461785.subst ENST00000486812.subst ENST00000286800.subst ENST00000398437.subst ENST00000461889.subst ENST00000486902.subst ENST00000286808.subst ENST00000398449.subst ENST00000462050.subst ENST00000486937.subst ENST00000286827.subst ENST00000398457.subst ENST00000462212.subst ENST00000487113.subst ENST00000286835.subst ENST00000398499.subst ENST00000462214.subst ENST00000487155.subst ENST00000288319.subst ENST00000398505.subst ENST00000462224.subst ENST00000487266.subst ENST00000288344.subst ENST00000398511.subst ENST00000462262.subst ENST00000487297.subst ENST00000288350.subst ENST00000398548.subst ENST00000462267.subst ENST00000487374.subst ENST00000288383.subst ENST00000398585.subst ENST00000462274.subst ENST00000487427.subst ENST00000290130.subst ENST00000398598.subst ENST00000462349.subst ENST00000487434.subst ENST00000290155.subst ENST00000398600.subst ENST00000462467.subst ENST00000487711.subst ENST00000290178.subst ENST00000398632.subst ENST00000462566.subst ENST00000487869.subst ENST00000290200.subst ENST00000398646.subst ENST00000462569.subst ENST00000487909.subst ENST00000290219.subst ENST00000398647.subst ENST00000462571.subst ENST00000487951.subst ENST00000290244.subst ENST00000398652.subst ENST00000462742.subst ENST00000487990.subst ENST00000290299.subst ENST00000398714.subst ENST00000463060.subst ENST00000487994.subst ENST00000290310.subst ENST00000398753.subst ENST00000463070.subst 
ENST00000488166.subst ENST00000290349.subst ENST00000398897.subst ENST00000463138.subst ENST00000488167.subst ENST00000290354.subst ENST00000398905.subst ENST00000463216.subst ENST00000488368.subst ENST00000290399.subst ENST00000398907.subst ENST00000463276.subst ENST00000488392.subst ENST00000291525.subst ENST00000398910.subst ENST00000463599.subst ENST00000488492.subst ENST00000291526.subst ENST00000398911.subst ENST00000463631.subst ENST00000488522.subst ENST00000291527.subst ENST00000398919.subst ENST00000463668.subst ENST00000488526.subst ENST00000291532.subst ENST00000398930.subst ENST00000463674.subst ENST00000488556.subst ENST00000291535.subst ENST00000398932.subst ENST00000463771.subst ENST00000488577.subst ENST00000291536.subst ENST00000398934.subst ENST00000463807.subst ENST00000488791.subst ENST00000291539.subst ENST00000398938.subst ENST00000463902.subst ENST00000489072.subst ENST00000291547.subst ENST00000398948.subst ENST00000463917.subst ENST00000489175.subst ENST00000291552.subst ENST00000398956.subst ENST00000464037.subst ENST00000489201.subst ENST00000291554.subst ENST00000398960.subst ENST00000464058.subst ENST00000489261.subst ENST00000291560.subst ENST00000398998.subst ENST00000464078.subst ENST00000489319.subst ENST00000291565.subst ENST00000399000.subst ENST00000464215.subst ENST00000489469.subst ENST00000291568.subst ENST00000399001.subst ENST00000464256.subst ENST00000489661.subst ENST00000291572.subst ENST00000399010.subst ENST00000464265.subst ENST00000489676.subst ENST00000291574.subst ENST00000399017.subst ENST00000464357.subst ENST00000489903.subst ENST00000291576.subst ENST00000399098.subst ENST00000464435.subst ENST00000490032.subst ENST00000291577.subst ENST00000399102.subst ENST00000464589.subst ENST00000490091.subst ENST00000291582.subst ENST00000399103.subst ENST00000464750.subst ENST00000490184.subst ENST00000291592.subst ENST00000399120.subst ENST00000464778.subst ENST00000490220.subst ENST00000291634.subst 
ENST00000399135.subst ENST00000464867.subst ENST00000490358.subst ENST00000291670.subst ENST00000399136.subst ENST00000465077.subst ENST00000490393.subst ENST00000291672.subst ENST00000399137.subst ENST00000465143.subst ENST00000490450.subst ENST00000291688.subst ENST00000399139.subst ENST00000465326.subst ENST00000490468.subst ENST00000291691.subst ENST00000399151.subst ENST00000465356.subst ENST00000490479.subst ENST00000291700.subst ENST00000399191.subst ENST00000465532.subst ENST00000490666.subst ENST00000291705.subst ENST00000399207.subst ENST00000465574.subst ENST00000490714.subst ENST00000299295.subst ENST00000399208.subst ENST00000465732.subst ENST00000490803.subst ENST00000299340.subst ENST00000399215.subst ENST00000465834.subst ENST00000490860.subst ENST00000299443.subst ENST00000399240.subst ENST00000465905.subst ENST00000490982.subst ENST00000300255.subst ENST00000399272.subst ENST00000465955.subst ENST00000491110.subst ENST00000300258.subst ENST00000399284.subst ENST00000465968.subst ENST00000491183.subst ENST00000300260.subst ENST00000399286.subst ENST00000466122.subst ENST00000491298.subst ENST00000300278.subst ENST00000399289.subst ENST00000466134.subst ENST00000491389.subst ENST00000300305.subst ENST00000399292.subst ENST00000466285.subst ENST00000491395.subst ENST00000300481.subst ENST00000399295.subst ENST00000466328.subst ENST00000491486.subst ENST00000300482.subst ENST00000399299.subst ENST00000466434.subst ENST00000491513.subst ENST00000300527.subst ENST00000399312.subst ENST00000466453.subst ENST00000491559.subst ENST00000302347.subst ENST00000399338.subst ENST00000466472.subst ENST00000491564.subst ENST00000303071.subst ENST00000399349.subst ENST00000466474.subst ENST00000491625.subst ENST00000303113.subst ENST00000399352.subst ENST00000466639.subst ENST00000491666.subst ENST00000303645.subst ENST00000399353.subst ENST00000466746.subst ENST00000491703.subst ENST00000303775.subst ENST00000399355.subst ENST00000466787.subst 
ENST00000491729.subst ENST00000307301.subst ENST00000399367.subst ENST00000466846.subst ENST00000491776.subst ENST00000309117.subst ENST00000399442.subst ENST00000466861.subst ENST00000491794.subst ENST00000309434.subst ENST00000399784.subst ENST00000466882.subst ENST00000491828.subst ENST00000310645.subst ENST00000399804.subst ENST00000466954.subst ENST00000491838.subst ENST00000310826.subst ENST00000399889.subst ENST00000467026.subst ENST00000491927.subst ENST00000311124.subst ENST00000399899.subst ENST00000467074.subst ENST00000491952.subst ENST00000312957.subst ENST00000399907.subst ENST00000467112.subst ENST00000492229.subst ENST00000313806.subst ENST00000399909.subst ENST00000467162.subst ENST00000492275.subst ENST00000314103.subst ENST00000399913.subst ENST00000467280.subst ENST00000492280.subst ENST00000314399.subst ENST00000399914.subst ENST00000467315.subst ENST00000492336.subst ENST00000318948.subst ENST00000399921.subst ENST00000467358.subst ENST00000492414.subst ENST00000319294.subst ENST00000399925.subst ENST00000467403.subst ENST00000492514.subst ENST00000320216.subst ENST00000399926.subst ENST00000467445.subst ENST00000492600.subst ENST00000323084.subst ENST00000399928.subst ENST00000467510.subst ENST00000492603.subst ENST00000325223.subst ENST00000399934.subst ENST00000467565.subst ENST00000492638.subst ENST00000327505.subst ENST00000399935.subst ENST00000467575.subst ENST00000492656.subst ENST00000327574.subst ENST00000399947.subst ENST00000467577.subst ENST00000492742.subst ENST00000327783.subst ENST00000399975.subst ENST00000467616.subst ENST00000492760.subst ENST00000328264.subst ENST00000399976.subst ENST00000467692.subst ENST00000492833.subst ENST00000328619.subst ENST00000400043.subst ENST00000467731.subst ENST00000492851.subst ENST00000328656.subst ENST00000400075.subst ENST00000467818.subst ENST00000492864.subst ENST00000328735.subst ENST00000400087.subst ENST00000467908.subst ENST00000492930.subst ENST00000328862.subst 
ENST00000400090.subst ENST00000468009.subst ENST00000492962.subst ENST00000329122.subst ENST00000400093.subst ENST00000468016.subst ENST00000493019.subst ENST00000329319.subst ENST00000400094.subst ENST00000468039.subst ENST00000493196.subst ENST00000329553.subst ENST00000400099.subst ENST00000468059.subst ENST00000493295.subst ENST00000329621.subst ENST00000400127.subst ENST00000468090.subst ENST00000493464.subst ENST00000329623.subst ENST00000400128.subst ENST00000468349.subst ENST00000493503.subst ENST00000329667.subst ENST00000400131.subst ENST00000468360.subst ENST00000493524.subst ENST00000330205.subst ENST00000400135.subst ENST00000468392.subst ENST00000493640.subst ENST00000330317.subst ENST00000400165.subst ENST00000468429.subst ENST00000493753.subst ENST00000330333.subst ENST00000400166.subst ENST00000468474.subst ENST00000493811.subst ENST00000330714.subst ENST00000400169.subst ENST00000468506.subst ENST00000493883.subst ENST00000330798.subst ENST00000400183.subst ENST00000468508.subst ENST00000494243.subst ENST00000330938.subst ENST00000400199.subst ENST00000468643.subst ENST00000494252.subst ENST00000330942.subst ENST00000400202.subst ENST00000468717.subst ENST00000494296.subst ENST00000331343.subst ENST00000400211.subst ENST00000468726.subst ENST00000494310.subst ENST00000331573.subst ENST00000400274.subst ENST00000468788.subst ENST00000494435.subst ENST00000331923.subst ENST00000400304.subst ENST00000468805.subst ENST00000494498.subst ENST00000332131.subst ENST00000400305.subst ENST00000468864.subst ENST00000494690.subst ENST00000332149.subst ENST00000400308.subst ENST00000468874.subst ENST00000494755.subst ENST00000332378.subst ENST00000400309.subst ENST00000468924.subst ENST00000494829.subst ENST00000332512.subst ENST00000400310.subst ENST00000468982.subst ENST00000495005.subst ENST00000332859.subst ENST00000400314.subst ENST00000469079.subst ENST00000495007.subst ENST00000333229.subst ENST00000400337.subst ENST00000469087.subst 
ENST00000495055.subst ENST00000333337.subst ENST00000400365.subst ENST00000469240.subst ENST00000495217.subst ENST00000333634.subst ENST00000400368.subst ENST00000469393.subst ENST00000495240.subst ENST00000333781.subst ENST00000400372.subst ENST00000469395.subst ENST00000495274.subst ENST00000333892.subst ENST00000400374.subst ENST00000469412.subst ENST00000495343.subst ENST00000334046.subst ENST00000400375.subst ENST00000469482.subst ENST00000495344.subst ENST00000334055.subst ENST00000400377.subst ENST00000469521.subst ENST00000495363.subst ENST00000334058.subst ENST00000400379.subst ENST00000469658.subst ENST00000495475.subst ENST00000334063.subst ENST00000400421.subst ENST00000469939.subst ENST00000495521.subst ENST00000334067.subst ENST00000400423.subst ENST00000470029.subst ENST00000495656.subst ENST00000334068.subst ENST00000400424.subst ENST00000470108.subst ENST00000495858.subst ENST00000334151.subst ENST00000400427.subst ENST00000470196.subst ENST00000495892.subst ENST00000334160.subst ENST00000400454.subst ENST00000470450.subst ENST00000496044.subst ENST00000334352.subst ENST00000400477.subst ENST00000470533.subst ENST00000496121.subst ENST00000334494.subst ENST00000400485.subst ENST00000470545.subst ENST00000496124.subst ENST00000334538.subst ENST00000400532.subst ENST00000470586.subst ENST00000496321.subst ENST00000334662.subst ENST00000400546.subst ENST00000470658.subst ENST00000496395.subst ENST00000334664.subst ENST00000400558.subst ENST00000470682.subst ENST00000496416.subst ENST00000334670.subst ENST00000400559.subst ENST00000470742.subst ENST00000496462.subst ENST00000334680.subst ENST00000400562.subst ENST00000470800.subst ENST00000496485.subst ENST00000334849.subst ENST00000400564.subst ENST00000470864.subst ENST00000496601.subst ENST00000334897.subst ENST00000400566.subst ENST00000470886.subst ENST00000496607.subst ENST00000335093.subst ENST00000400577.subst ENST00000470912.subst ENST00000496615.subst ENST00000335440.subst 
ENST00000401402.subst ENST00000470944.subst ENST00000496664.subst ENST00000335512.subst ENST00000402202.subst ENST00000470987.subst ENST00000496759.subst ENST00000336648.subst ENST00000404019.subst ENST00000471250.subst ENST00000496774.subst ENST00000337385.subst ENST00000404220.subst ENST00000471260.subst ENST00000496779.subst ENST00000337909.subst ENST00000405436.subst ENST00000471269.subst ENST00000496783.subst ENST00000338326.subst ENST00000407780.subst ENST00000471277.subst ENST00000496824.subst ENST00000338785.subst ENST00000408910.subst ENST00000471468.subst ENST00000497243.subst ENST00000339024.subst ENST00000408989.subst ENST00000471490.subst ENST00000497313.subst ENST00000339195.subst ENST00000409416.subst ENST00000471540.subst ENST00000497493.subst ENST00000339659.subst ENST00000410005.subst ENST00000471689.subst ENST00000497547.subst ENST00000339775.subst ENST00000411496.subst ENST00000471860.subst ENST00000497630.subst ENST00000339818.subst ENST00000411566.subst ENST00000471909.subst ENST00000497664.subst ENST00000339944.subst ENST00000411651.subst ENST00000472184.subst ENST00000497805.subst ENST00000340344.subst ENST00000411828.subst ENST00000472191.subst ENST00000497833.subst ENST00000340345.subst ENST00000412604.subst ENST00000472272.subst ENST00000497873.subst ENST00000340648.subst ENST00000413017.subst ENST00000472318.subst ENST00000497881.subst ENST00000341322.subst ENST00000413758.subst ENST00000472364.subst ENST00000497909.subst ENST00000341618.subst ENST00000413778.subst ENST00000472398.subst ENST00000498040.subst ENST00000342101.subst ENST00000413881.subst ENST00000472401.subst ENST00000498121.subst ENST00000342108.subst ENST00000414079.subst ENST00000472429.subst ENST00000498151.subst ENST00000342136.subst ENST00000415023.subst ENST00000472548.subst ENST00000498210.subst ENST00000342220.subst ENST00000415847.subst ENST00000472557.subst ENST00000498351.subst ENST00000342449.subst ENST00000415997.subst ENST00000472587.subst 
ENST00000498355.subst ENST00000343118.subst ENST00000416044.subst ENST00000472588.subst ENST00000498371.subst ENST00000343528.subst ENST00000416357.subst ENST00000472602.subst ENST00000498430.subst ENST00000343687.subst ENST00000417007.subst ENST00000472607.subst ENST00000498614.subst ENST00000344330.subst ENST00000417060.subst ENST00000472777.subst ENST00000498666.subst ENST00000344577.subst ENST00000417133.subst ENST00000473102.subst ENST00000498670.subst ENST00000344691.subst ENST00000417181.subst ENST00000473107.subst ENST00000498789.subst ENST00000345496.subst ENST00000417564.subst ENST00000473212.subst ENST00000498799.subst ENST00000346798.subst ENST00000417871.subst ENST00000473381.subst ENST00000498841.subst ENST00000347667.subst ENST00000417954.subst ENST00000473752.subst ENST00000517777.subst ENST00000347800.subst ENST00000418301.subst ENST00000473813.subst ENST00000518033.subst ENST00000348354.subst ENST00000418336.subst ENST00000473988.subst ENST00000518236.subst ENST00000348499.subst ENST00000418394.subst ENST00000474114.subst ENST00000518498.subst ENST00000348831.subst ENST00000418766.subst ENST00000474132.subst ENST00000520389.subst ENST00000348990.subst ENST00000418933.subst ENST00000474136.subst ENST00000521987.subst ENST00000349048.subst ENST00000419093.subst ENST00000474272.subst ENST00000521995.subst ENST00000349112.subst ENST00000419219.subst ENST00000474319.subst ENST00000522411.subst ENST00000349485.subst ENST00000419241.subst ENST00000474336.subst ENST00000522931.subst ENST00000349499.subst ENST00000419378.subst ENST00000474355.subst ENST00000523126.subst ENST00000351097.subst ENST00000419699.subst ENST00000474368.subst ENST00000523323.subst ENST00000351429.subst ENST00000419868.subst ENST00000474455.subst ENST00000524251.subst ENST00000352133.subst ENST00000420068.subst ENST00000474596.subst ENST00000527919.subst ENST00000352178.subst ENST00000420072.subst ENST00000474735.subst ENST00000530812.subst ENST00000352483.subst 
ENST00000420455.subst ENST00000474737.subst ENST00000530908.subst ENST00000352957.subst ENST00000420666.subst ENST00000474775.subst ENST00000535441.subst ENST00000354192.subst ENST00000421049.subst ENST00000474835.subst ENST00000536776.subst ENST00000354250.subst ENST00000421541.subst ENST00000475009.subst ENST00000536861.subst ENST00000354749.subst ENST00000421802.subst ENST00000475047.subst ENST00000540756.subst ENST00000354828.subst ENST00000422809.subst ENST00000475072.subst ENST00000540844.subst ENST00000355153.subst ENST00000422875.subst ENST00000475170.subst ENST00000541036.subst ENST00000355459.subst ENST00000422891.subst ENST00000475205.subst ENST00000542230.subst ENST00000355480.subst ENST00000422911.subst ENST00000475297.subst ENST00000543733.subst ENST00000355666.subst ENST00000423045.subst ENST00000475344.subst ENST00000545369.subst ENST00000355680.subst ENST00000423206.subst ENST00000475402.subst ENST00000545939.subst ENST00000356275.subst ENST00000423214.subst ENST00000475422.subst ENST00000546158.subst ENST00000356396.subst ENST00000423596.subst ENST00000475534.subst ENST00000546469.subst ENST00000356577.subst ENST00000424203.subst ENST00000475618.subst ENST00000546482.subst ENST00000357345.subst ENST00000424365.subst ENST00000475639.subst ENST00000547141.subst ENST00000357704.subst ENST00000425336.subst ENST00000475776.subst ENST00000547201.subst ENST00000357903.subst ENST00000426537.subst ENST00000475838.subst ENST00000547657.subst ENST00000357985.subst ENST00000426783.subst ENST00000475864.subst ENST00000548219.subst ENST00000358268.subst ENST00000426935.subst ENST00000476084.subst ENST00000548467.subst ENST00000358356.subst ENST00000426947.subst ENST00000476106.subst ENST00000548570.subst ENST00000358918.subst ENST00000427445.subst ENST00000476313.subst ENST00000549362.subst ENST00000359568.subst ENST00000427464.subst ENST00000476326.subst ENST00000549948.subst ENST00000359624.subst ENST00000427746.subst ENST00000476524.subst 
ENST00000550131.subst ENST00000359726.subst ENST00000427803.subst ENST00000476653.subst ENST00000551367.subst ENST00000359759.subst ENST00000428240.subst ENST00000476784.subst ENST00000551788.subst ENST00000360214.subst ENST00000428693.subst ENST00000476848.subst ENST00000552581.subst ENST00000360525.subst ENST00000429064.subst ENST00000476914.subst ENST00000553001.subst ENST00000360542.subst ENST00000429093.subst ENST00000476948.subst ENST00000557820.subst ENST00000360697.subst ENST00000429727.subst ENST00000476950.subst ENST00000558955.subst ENST00000360731.subst ENST00000429827.subst ENST00000477091.subst ENST00000560448.subst ENST00000360770.subst ENST00000430013.subst ENST00000477351.subst ENST00000567670.subst ENST00000360938.subst ENST00000430093.subst ENST00000477419.subst ENST00000593338.subst ENST00000361093.subst ENST00000430354.subst ENST00000477633.subst ENST00000594149.subst ENST00000361335.subst ENST00000430874.subst ENST00000477954.subst ENST00000594320.subst ENST00000361371.subst ENST00000431166.subst ENST00000478035.subst ENST00000599962.subst ENST00000361534.subst ENST00000431229.subst ENST00000478105.subst ENST00000607049.subst ENST00000361567.subst ENST00000431234.subst ENST00000478183.subst ENST00000607150.subst ENST00000361624.subst ENST00000431254.subst ENST00000478200.subst ENST00000608928.subst ENST00000361681.subst ENST00000431390.subst ENST00000478268.subst ENST00000609325.subst ENST00000361739.subst ENST00000431599.subst ENST00000478273.subst ENST00000609664.subst ENST00000361802.subst ENST00000431628.subst ENST00000478282.subst ENST00000609713.subst ENST00000361851.subst ENST00000432085.subst ENST00000478313.subst ENST00000609985.subst ENST00000361866.subst ENST00000432178.subst ENST00000478372.subst ENST00000610200.subst ENST00000361899.subst ENST00000432231.subst ENST00000478426.subst ENST00000610622.subst ENST00000366093.subst ENST00000432378.subst ENST00000478613.subst ENST00000610664.subst ENST00000367071.subst 
ENST00000432907.subst ENST00000478674.subst ENST00000611195.subst ENST00000379960.subst ENST00000433067.subst ENST00000478680.subst ENST00000611936.subst ENST00000380010.subst ENST00000433931.subst ENST00000478709.subst ENST00000612273.subst ENST00000380095.subst ENST00000433957.subst ENST00000478932.subst ENST00000612277.subst ENST00000380221.subst ENST00000434667.subst ENST00000479117.subst ENST00000612472.subst ENST00000380276.subst ENST00000435323.subst ENST00000479127.subst ENST00000612610.subst ENST00000380328.subst ENST00000435722.subst ENST00000479152.subst ENST00000612624.subst ENST00000380486.subst ENST00000435732.subst ENST00000479153.subst ENST00000612702.subst ENST00000380588.subst ENST00000436227.subst ENST00000479202.subst ENST00000612746.subst ENST00000380618.subst ENST00000436324.subst ENST00000479325.subst ENST00000613245.subst ENST00000380620.subst ENST00000436357.subst ENST00000479378.subst ENST00000613488.subst ENST00000380631.subst ENST00000437126.subst ENST00000479424.subst ENST00000613499.subst ENST00000380634.subst ENST00000437180.subst ENST00000479429.subst ENST00000613611.subst ENST00000380637.subst ENST00000437338.subst ENST00000479548.subst ENST00000614229.subst ENST00000380671.subst ENST00000437395.subst ENST00000479557.subst ENST00000614538.subst ENST00000380708.subst ENST00000437442.subst ENST00000479586.subst ENST00000614657.subst ENST00000380747.subst ENST00000437626.subst ENST00000479654.subst ENST00000614763.subst ENST00000380748.subst ENST00000437996.subst ENST00000479810.subst ENST00000614971.subst ENST00000380749.subst ENST00000438837.subst ENST00000479849.subst ENST00000615172.subst ENST00000380800.subst ENST00000438952.subst ENST00000479930.subst ENST00000615480.subst ENST00000380900.subst ENST00000439107.subst ENST00000479964.subst ENST00000616529.subst ENST00000381132.subst ENST00000439213.subst ENST00000480147.subst ENST00000616689.subst ENST00000381135.subst ENST00000439274.subst ENST00000480179.subst 
ENST00000617313.subst ENST00000381151.subst ENST00000439427.subst ENST00000480196.subst ENST00000617706.subst ENST00000381284.subst ENST00000439593.subst ENST00000480234.subst ENST00000617716.subst ENST00000381285.subst ENST00000440086.subst ENST00000480359.subst ENST00000617870.subst ENST00000381291.subst ENST00000440126.subst ENST00000480452.subst ENST00000618007.subst ENST00000381318.subst ENST00000440288.subst ENST00000480456.subst ENST00000618024.subst ENST00000381540.subst ENST00000440526.subst ENST00000480486.subst ENST00000618699.subst ENST00000381554.subst ENST00000440794.subst ENST00000480553.subst ENST00000618832.subst ENST00000381679.subst ENST00000440810.subst ENST00000480612.subst ENST00000619120.subst ENST00000381692.subst ENST00000440966.subst ENST00000480690.subst ENST00000619249.subst ENST00000381815.subst ENST00000441030.subst ENST00000480786.subst ENST00000619537.subst ENST00000381831.subst ENST00000441128.subst ENST00000480893.subst ENST00000619610.subst ENST00000381839.subst ENST00000441403.subst ENST00000480896.subst ENST00000619682.subst ENST00000381947.subst ENST00000441787.subst ENST00000480950.subst ENST00000619874.subst ENST00000381995.subst ENST00000441940.subst ENST00000481022.subst ENST00000620015.subst ENST00000382238.subst ENST00000442071.subst ENST00000481059.subst ENST00000620065.subst ENST00000382264.subst ENST00000442441.subst ENST00000481113.subst ENST00000620117.subst ENST00000382348.subst ENST00000442448.subst ENST00000481185.subst ENST00000620442.subst ENST00000382357.subst ENST00000442660.subst ENST00000481302.subst ENST00000620481.subst ENST00000382373.subst ENST00000443046.subst ENST00000481319.subst ENST00000620528.subst ENST00000382491.subst ENST00000443073.subst ENST00000481411.subst ENST00000620920.subst ENST00000382499.subst ENST00000443408.subst ENST00000481448.subst ENST00000621064.subst ENST00000382549.subst ENST00000443703.subst ENST00000481458.subst ENST00000621162.subst ENST00000382699.subst 
ENST00000443785.subst ENST00000481460.subst ENST00000621201.subst ENST00000382751.subst ENST00000444335.subst ENST00000481477.subst ENST00000621478.subst ENST00000382822.subst ENST00000444517.subst ENST00000481512.subst ENST00000621601.subst ENST00000382826.subst ENST00000445049.subst ENST00000481546.subst ENST00000622113.subst ENST00000382828.subst ENST00000445245.subst ENST00000481605.subst ENST00000622352.subst ENST00000382830.subst ENST00000445393.subst ENST00000481609.subst ENST00000622690.subst ENST00000382835.subst ENST00000445582.subst ENST00000481638.subst ENST00000622914.subst ENST00000389124.subst ENST00000445668.subst ENST00000481710.subst ENST00000622915.subst ENST00000389125.subst ENST00000445724.subst ENST00000481838.subst ENST00000622934.subst ENST00000389194.subst ENST00000446405.subst ENST00000481861.subst ENST00000623011.subst ENST00000389195.subst ENST00000446924.subst ENST00000481883.subst ENST00000623375.subst ENST00000389690.subst ENST00000447016.subst ENST00000481921.subst ENST00000623390.subst ENST00000389861.subst ENST00000447177.subst ENST00000482032.subst ENST00000623476.subst ENST00000389863.subst ENST00000447207.subst ENST00000482084.subst ENST00000623661.subst ENST00000389995.subst ENST00000447939.subst ENST00000482186.subst ENST00000623703.subst ENST00000390689.subst ENST00000447980.subst ENST00000482192.subst ENST00000623744.subst ENST00000390690.subst ENST00000448850.subst ENST00000482273.subst ENST00000623795.subst ENST00000391617.subst ENST00000449165.subst ENST00000482318.subst ENST00000623803.subst ENST00000391618.subst ENST00000449395.subst ENST00000482508.subst ENST00000623810.subst ENST00000391620.subst ENST00000449622.subst ENST00000482533.subst ENST00000623903.subst ENST00000391621.subst ENST00000449640.subst ENST00000482575.subst ENST00000623939.subst ENST00000391624.subst ENST00000450351.subst ENST00000482663.subst ENST00000623960.subst ENST00000397628.subst ENST00000450895.subst ENST00000482679.subst 
ENST00000623998.subst ENST00000397637.subst ENST00000451065.subst ENST00000482733.subst ENST00000624019.subst ENST00000397638.subst ENST00000451211.subst ENST00000482761.subst ENST00000624077.subst ENST00000397648.subst ENST00000451248.subst ENST00000482775.subst ENST00000624081.subst ENST00000397679.subst ENST00000451489.subst ENST00000482915.subst ENST00000624120.subst ENST00000397680.subst ENST00000452420.subst ENST00000482953.subst ENST00000624304.subst ENST00000397682.subst ENST00000452550.subst ENST00000483178.subst ENST00000624312.subst ENST00000397683.subst ENST00000453032.subst ENST00000483315.subst ENST00000624406.subst ENST00000397691.subst ENST00000453553.subst ENST00000483326.subst ENST00000624445.subst ENST00000397692.subst ENST00000453626.subst ENST00000483568.subst ENST00000624534.subst ENST00000397694.subst ENST00000454499.subst ENST00000483844.subst ENST00000624569.subst ENST00000397701.subst ENST00000454800.subst ENST00000483896.subst ENST00000624648.subst ENST00000397708.subst ENST00000455097.subst ENST00000483973.subst ENST00000624691.subst ENST00000397728.subst ENST00000455164.subst ENST00000483977.subst ENST00000624714.subst ENST00000397743.subst ENST00000455177.subst ENST00000484028.subst ENST00000624739.subst ENST00000397746.subst ENST00000455528.subst ENST00000484047.subst ENST00000624748.subst ENST00000397748.subst ENST00000455571.subst ENST00000484090.subst ENST00000624758.subst ENST00000397763.subst ENST00000456489.subst ENST00000484174.subst ENST00000624808.subst ENST00000397826.subst ENST00000456957.subst ENST00000484294.subst ENST00000624901.subst ENST00000397846.subst ENST00000457143.subst ENST00000484377.subst ENST00000624921.subst ENST00000397850.subst ENST00000457208.subst ENST00000484403.subst ENST00000624932.subst ENST00000397852.subst ENST00000457359.subst ENST00000484465.subst ENST00000624934.subst ENST00000397854.subst ENST00000457807.subst ENST00000484627.subst ENST00000626972.subst ENST00000397857.subst 
ENST00000457828.subst ENST00000484712.subst ENST00000628044.subst ENST00000397886.subst ENST00000457905.subst ENST00000484808.subst ENST00000628202.subst ENST00000397887.subst ENST00000457956.subst ENST00000484861.subst ENST00000628776.subst ENST00000397893.subst ENST00000458223.subst ENST00000484865.subst ENST00000629643.subst ENST00000397898.subst ENST00000458295.subst ENST00000484878.subst ENST00000630077.subst ENST00000397907.subst ENST00000458356.subst ENST00000484887.subst ENST00000631642.subst ENST00000397911.subst ENST00000458387.subst ENST00000484983.subst ENST00000632537.subst ENST00000397916.subst ENST00000459639.subst ENST00000485067.subst ENST00000632881.subst ENST00000397928.subst ENST00000459741.subst ENST00000485190.subst ENST00000633442.subst ENST00000397932.subst ENST00000459833.subst ENST00000485207.subst ENST00000633593.subst ENST00000397956.subst ENST00000459895.subst ENST00000485272.subst ENST00000634020.subst ENST00000397961.subst ENST00000459909.subst ENST00000485299.subst ENST00000634021.subst ENST00000397994.subst ENST00000459922.subst ENST00000485357.subst ENST00000634453.subst ENST00000398058.subst ENST00000459939.subst ENST00000485402.subst ENST00000634718.subst ENST00000398061.subst ENST00000460011.subst ENST00000485488.subst ENST00000635108.subst ENST00000398063.subst ENST00000460020.subst ENST00000485493.subst ENST00000635189.subst ENST00000398078.subst ENST00000460174.subst ENST00000485550.subst ENST00000635325.subst ENST00000398081.subst ENST00000460207.subst ENST00000485591.subst

I am currently running sift4g in Docker to see whether the error from last night is reproducible.

I have no name!@bc8b48e0e31e:/$ ./sift4g/bin/sift4g -d /Users/ericliao/Desktop/WNV_project_files/sift_docker/bigdrive/SIFT_databases/uniref90.fasta -q /Users/ericliao/Desktop/WNV_project_files/sift_docker/SIFT4G_Create_Genomic_DB/test_files/homo_sapiens_small/all_prot.fasta --subst /Users/ericliao/Desktop/WNV_project_files/sift_docker/SIFT4G_Create_Genomic_DB/test_files/homo_sapiens_small/subst --out /Users/ericliao/Desktop/WNV_project_files/sift_docker/SIFT4G_Create_Genomic_DB/test_files/homo_sapiens_small/singleRecords_with_scores --sub-results
Checking query data and substitutions files

Searching database for candidate sequences

Afei99357 commented 7 months ago

Hello! I got this error after I ran sift4g separately. Any idea where it comes from? Is it a memory issue?

I have no name!@bc8b48e0e31e:/$ ./sift4g/bin/sift4g -d /Users/ericliao/Desktop/WNV_project_files/sift_docker/bigdrive/SIFT_databases/uniref90.fasta -q /Users/ericliao/Desktop/WNV_project_files/sift_docker/SIFT4G_Create_Genomic_DB/test_files/homo_sapiens_small/all_prot.fasta --subst /Users/ericliao/Desktop/WNV_project_files/sift_docker/SIFT4G_Create_Genomic_DB/test_files/homo_sapiens_small/subst --out /Users/ericliao/Desktop/WNV_project_files/sift_docker/SIFT4G_Create_Genomic_DB/test_files/homo_sapiens_small/singleRecords_with_scores --sub-results
Checking query data and substitutions files

Searching database for candidate sequences
* processing database part 275 (size ~0.25 GB): 100.00/100.00% *
Killed

pauline-ng commented 7 months ago

Looks like you ran out of memory
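One way to check this from inside the container is to read the cgroup memory files. The sketch below is only an illustration: it assumes a Linux container with the usual cgroup mount points (`/sys/fs/cgroup/memory.max` and `/sys/fs/cgroup/memory.events` for cgroup v2, `/sys/fs/cgroup/memory/memory.limit_in_bytes` for v1), and the function names are made up for this example rather than being part of SIFT4G.

```python
from pathlib import Path

# Usual cgroup mount points inside a Linux container (assumption: the
# actual layout depends on the host and the Docker version).
CGROUP_V2_LIMIT = Path("/sys/fs/cgroup/memory.max")
CGROUP_V1_LIMIT = Path("/sys/fs/cgroup/memory/memory.limit_in_bytes")
CGROUP_V2_EVENTS = Path("/sys/fs/cgroup/memory.events")


def parse_limit(text):
    """Return the limit in bytes, or None when the file says 'max' (no limit)."""
    value = text.strip()
    return None if value == "max" else int(value)


def parse_oom_kills(events_text):
    """Return the 'oom_kill' counter from memory.events text, or None if absent."""
    for line in events_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == "oom_kill":
            return int(parts[1])
    return None


def report():
    """Print the memory limit and OOM-kill count when the files exist."""
    for path in (CGROUP_V2_LIMIT, CGROUP_V1_LIMIT):
        if path.exists():
            limit = parse_limit(path.read_text())
            if limit is None:
                print(f"{path}: no memory limit set")
            else:
                print(f"{path}: limit is {limit / 1024**3:.2f} GiB")
            break
    else:
        print("no cgroup memory limit file found (different layout or not Linux)")
    if CGROUP_V2_EVENTS.exists():
        kills = parse_oom_kills(CGROUP_V2_EVENTS.read_text())
        print(f"oom_kill events in this cgroup: {kills}")
```

Calling `report()` in the container before and after a run should show whether the `oom_kill` counter went up, which would point to the kernel killing the process for exceeding the limit.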

Afei99357 commented 7 months ago

Thank you so much. I realized it was a memory issue, so I raised Docker's memory limit to 14 GB (out of 16 GB total). But now it crashes at a different step, and I still have no idea what causes the crash.

here is the terminal output:

I have no name!@1945c502387a:/sift4g_run_test/SIFT4G_Create_Genomic_DB$ perl make-SIFT-db-all.pl -config /sift4g_run_test/SIFT4G_Create_Genomic_DB/test_files/homo_sapiens-test.txt
entered mkdir /sift4g_run_test/SIFT4G_Create_Genomic_DB/test_files/homo_sapiens_small/GRCh38.83
converting gene format to use-able input
done converting gene format
making single records file
done making single records template
making noncoding records file
done making noncoding records
make the fasta sequences
done making the fasta sequences
start siftsharp, getting the alignments
/sift4g/bin/sift4g -d /sift4g_run_test/bigdrive/SIFT_databases/uniref90.fasta -q /sift4g_run_test/SIFT4G_Create_Genomic_DB/test_files/homo_sapiens_small/all_prot.fasta --subst /sift4g_run_test/SIFT4G_Create_Genomic_DB/test_files/homo_sapiens_small/subst --out /sift4g_run_test/SIFT4G_Create_Genomic_DB/test_files/homo_sapiens_small/SIFT_predictions --sub-results
Checking query data and substitutions files

Searching database for candidate sequences

Aligning queries with candidate sequences

pauline-ng commented 7 months ago

There is no error message here.

Afei99357 commented 7 months ago

Yes. That has been the problem from the very beginning: it just stops without any error. Could it be another memory issue? How much memory does it need to run the test files? I can only give 14 GB to Docker. I am now running my own files on a more powerful machine, so hopefully it is just the memory issue.

It would be nice if it reported an error when the processing suddenly stops.
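Until then, a small wrapper can at least show how a step ended. This is a rough sketch, not part of SIFT4G: `describe_exit` and `run_and_report` are hypothetical helper names, and it assumes a POSIX system where Python reports a process killed by a signal as a negative return code.

```python
import signal
import subprocess
import sys


def _signal_name(number):
    """Map a signal number to its name, falling back to the raw number."""
    try:
        return signal.Signals(number).name
    except ValueError:
        return f"signal {number}"


def describe_exit(returncode):
    """Turn a process return code into a human-readable description."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # On POSIX, subprocess reports death-by-signal as -signal_number.
        return f"killed by {_signal_name(-returncode)}"
    if returncode > 128:
        # Shells report death-by-signal as 128 + signal number (137 = SIGKILL).
        return f"exit code {returncode}, possibly {_signal_name(returncode - 128)}"
    return f"exited with code {returncode}"


def run_and_report(cmd):
    """Run cmd (a list of arguments), print how it ended, return its code."""
    result = subprocess.run(cmd)
    print(f"{cmd[0]}: {describe_exit(result.returncode)}", file=sys.stderr)
    return result.returncode
```

Running the sift4g command from the terminal output above through `run_and_report` would print something like "killed by SIGKILL" if the process is terminated from outside, which is what the kernel's out-of-memory killer typically does. A silent stop with exit code 0 would instead suggest the problem lies elsewhere.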