Closed PrincyJohnson closed 2 years ago
Full paths must be used in the config files. Please change all the settings in the config file to full paths.
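One way to check a config file for relative paths before running the pipeline is sketched below. It is only an illustration, and it assumes the config uses `KEY=VALUE` lines. The key names in the sample (`EXAMPLE_DIR`, `EXAMPLE_SITE`, `EXAMPLE_VERSION`) are hypothetical, and the rule for deciding whether a value "looks like a path" is a simple heuristic, not something taken from the SIFT scripts.

```python
import posixpath


def find_relative_paths(config_text):
    """Return (key, value) pairs whose value looks like a path but is not absolute.

    Assumption: the config is made of KEY=VALUE lines. A value is treated as a
    path if it contains "/" and is not a URL (no "://").
    """
    problems = []
    for line in config_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key, value = key.strip(), value.strip()
        looks_like_path = "/" in value and "://" not in value
        if looks_like_path and not posixpath.isabs(value):
            problems.append((key, value))
    return problems


# Hypothetical sample config; the key names are placeholders.
sample = """\
EXAMPLE_DIR=./test_files101/homo_sapiens_small/
EXAMPLE_SITE=ftp://ftp.ensemblgenomes.org/pub/bacteria/release-34/gtf/
EXAMPLE_VERSION=GRCh38.83
"""

for key, value in find_relative_paths(sample):
    print(f"{key} is not a full path: {value}")
```

Any key reported here would need its value rewritten as a full path (one starting with `/`).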
Hello Pauline,
I am using full paths for all the files. It downloads the files, but then it says it can't access gene-annotation-src. Config file: candidatus_carsonella_ruddii_pv_config.txt
(base) princy_raagul@AG053360D-07006:/mnt/c/new/scripts_to_build_SIFT_db$ perl make-SIFT-db-all.pl -config test_files101/candidatus_carsonella_ruddii_pv_config.txt --ensembl_download entered mkdir /mnt/c/new/scripts_to_build_SIFT_db/test_files101/candidas100 /ASM1036v1.34 /mnt/c/new/scripts_to_build_SIFT_db/test_files101/candidas100 downloading gene annotation /gene-annotation-src: Scheme missing. --2022-07-08 15:14:45-- ftp://ftp.ensemblgenomes.org/pub/bacteria/release-34/gtf//bacteria_11_collection/candidatus_carsonella_ruddii_pv/Candidatus_carsonella_ruddii_pv.ASM1036v1.34.gtf.gz => ‘/mnt/c/new/scripts_to_build_SIFT_db/test_files101/candidas100/Candidatus_carsonella_ruddii_pv.ASM1036v1.34.gtf.gz’ Resolving ftp.ensemblgenomes.org (ftp.ensemblgenomes.org)... 193.62.193.141 Connecting to ftp.ensemblgenomes.org (ftp.ensemblgenomes.org)|193.62.193.141|:21... connected. Logging in as anonymous ... Logged in! ==> SYST ... done. ==> PWD ... done. ==> TYPE I ... done. ==> CWD (1) /pub/bacteria/release-34/gtf//bacteria_11_collection/candidatus_carsonella_ruddii_pv ... done. ==> SIZE Candidatus_carsonella_ruddii_pv.ASM1036v1.34.gtf.gz ... 14570 ==> PASV ... done. ==> RETR Candidatus_carsonella_ruddii_pv.ASM1036v1.34.gtf.gz ... done. Length: 14570 (14K) (unauthoritative)
Candidatus_carsonella_ruddii_pv.ASM1036v1.34.gtf.gz 100%[========================================================================================================================================>] 14.23K --.-KB/s in 0.001s
2022-07-08 15:14:47 (16.1 MB/s) - ‘/mnt/c/new/scripts_to_build_SIFT_db/test_files101/candidas100/Candidatus_carsonella_ruddii_pv.ASM1036v1.34.gtf.gz’ saved [14570]
FINISHED --2022-07-08 15:14:47-- Total wall clock time: 1.9s Downloaded: 1 files, 14K in 0.001s (16.1 MB/s) /gene-annotation-src: Scheme missing. --2022-07-08 15:14:47-- ftp://ftp.ensemblgenomes.org/pub/bacteria/release-34/fasta//bacteria_11_collection/candidatus_carsonella_ruddii_pv/pep/Candidatus_carsonella_ruddii_pv.ASM1036v1.34.pep.all.fa.gz => ‘/mnt/c/new/scripts_to_build_SIFT_db/test_files101/candidas100/Candidatus_carsonella_ruddii_pv.ASM1036v1.34.pep.all.fa.gz’ Resolving ftp.ensemblgenomes.org (ftp.ensemblgenomes.org)... 193.62.193.141 Connecting to ftp.ensemblgenomes.org (ftp.ensemblgenomes.org)|193.62.193.141|:21... connected. Logging in as anonymous ... Logged in! ==> SYST ... done. ==> PWD ... done. ==> TYPE I ... done. ==> CWD (1) /pub/bacteria/release-34/fasta//bacteria_11_collection/candidatus_carsonella_ruddii_pv/pep ... done. ==> SIZE Candidatus_carsonella_ruddii_pv.ASM1036v1.34.pep.all.fa.gz ... done.
==> PASV ... done. ==> RETR Candidatus_carsonella_ruddii_pv.ASM1036v1.34.pep.all.fa.gz ... No such file ‘Candidatus_carsonella_ruddii_pv.ASM1036v1.34.pep.all.fa.gz’.
done downloading gene annotation downloading fasta files --2022-07-08 15:14:49-- ftp://ftp.ensemblgenomes.org/pub/bacteria/release-34/fasta//bacteria_11_collection/candidatus_carsonella_ruddii_pv/dna/%0D => ‘/mnt/c/new/scripts_to_build_SIFT_db/test_files101/candidas100\r/chr-src\r/.listing’ Resolving ftp.ensemblgenomes.org (ftp.ensemblgenomes.org)... 193.62.193.141 Connecting to ftp.ensemblgenomes.org (ftp.ensemblgenomes.org)|193.62.193.141|:21... connected. Logging in as anonymous ... Logged in! ==> SYST ... done. ==> PWD ... done. ==> TYPE I ... done. ==> CWD (1) /pub/bacteria/release-34/fasta//bacteria_11_collection/candidatus_carsonella_ruddii_pv/dna ... done. ==> PASV ... done. ==> LIST ... done.
.listing [ <=> ] 1009 --.-KB/s in 0s
2022-07-08 15:14:50 (33.4 MB/s) - ‘/mnt/c/new/scripts_to_build_SIFT_db/test_files101/candidas100\r/chr-src\r/.listing’ saved [1009]
Removed ‘/mnt/c/new/scripts_to_build_SIFT_db/test_files101/candidas100\r/chr-src\r/.listing’. Rejecting ‘CHECKSUMS’. Rejecting ‘Candidatus_carsonella_ruddii_pv.ASM1036v1.dna.toplevel.fa.gz’. Rejecting ‘Candidatus_carsonella_ruddii_pv.ASM1036v1.dna_rm.chromosome.Chromosome.fa.gz’. Rejecting ‘Candidatus_carsonella_ruddii_pv.ASM1036v1.dna_rm.toplevel.fa.gz’. Rejecting ‘Candidatus_carsonella_ruddii_pv.ASM1036v1.dna_sm.chromosome.Chromosome.fa.gz’. Rejecting ‘Candidatus_carsonella_ruddii_pv.ASM1036v1.dna_sm.toplevel.fa.gz’. Rejecting ‘README’. --2022-07-08 15:14:50-- ftp://ftp.ensemblgenomes.org/pub/bacteria/release-34/fasta//bacteria_11_collection/candidatus_carsonella_ruddii_pv/dna/%0D => ‘/mnt/c/new/scripts_to_build_SIFT_db/test_files101/candidas100\r/chr-src\r/%0D’ ==> CWD not required. ==> SIZE \r ... done.
==> PASV ... done. ==> RETR \r ... No such file ‘\r’.
--2022-07-08 15:14:50-- ftp://ftp.ensemblgenomes.org/pub/bacteria/release-34/fasta//bacteria_11_collection/candidatus_carsonella_ruddii_pv/dna/%0D => ‘/mnt/c/new/scripts_to_build_SIFT_db/test_files101/candidas100\r/chr-src\r/.listing’ Resolving ftp.ensemblgenomes.org (ftp.ensemblgenomes.org)... 193.62.193.141 Connecting to ftp.ensemblgenomes.org (ftp.ensemblgenomes.org)|193.62.193.141|:21... connected. Logging in as anonymous ... Logged in! ==> SYST ... done. ==> PWD ... done. ==> TYPE I ... done. ==> CWD (1) /pub/bacteria/release-34/fasta//bacteria_11_collection/candidatus_carsonella_ruddii_pv/dna ... done. ==> PASV ... done. ==> LIST ... done.
.listing [ <=> ] 1009 --.-KB/s in 0s
2022-07-08 15:14:52 (37.6 MB/s) - ‘/mnt/c/new/scripts_to_build_SIFT_db/test_files101/candidas100\r/chr-src\r/.listing’ saved [1009]
Removed ‘/mnt/c/new/scripts_to_build_SIFT_db/test_files101/candidas100\r/chr-src\r/.listing’. Rejecting ‘CHECKSUMS’. Rejecting ‘Candidatus_carsonella_ruddii_pv.ASM1036v1.dna.chromosome.Chromosome.fa.gz’. Rejecting ‘Candidatus_carsonella_ruddii_pv.ASM1036v1.dna.toplevel.fa.gz’. Rejecting ‘Candidatus_carsonella_ruddii_pv.ASM1036v1.dna_rm.chromosome.Chromosome.fa.gz’. Rejecting ‘Candidatus_carsonella_ruddii_pv.ASM1036v1.dna_rm.toplevel.fa.gz’. Rejecting ‘Candidatus_carsonella_ruddii_pv.ASM1036v1.dna_sm.chromosome.Chromosome.fa.gz’. Rejecting ‘Candidatus_carsonella_ruddii_pv.ASM1036v1.dna_sm.toplevel.fa.gz’. Rejecting ‘README’. --2022-07-08 15:14:52-- ftp://ftp.ensemblgenomes.org/pub/bacteria/release-34/fasta//bacteria_11_collection/candidatus_carsonella_ruddii_pv/dna/%0D => ‘/mnt/c/new/scripts_to_build_SIFT_db/test_files101/candidas100\r/chr-src\r/%0D’ ==> CWD not required. ==> SIZE \r ... done.
==> PASV ... done. ==> RETR \r ... No such file ‘\r’.
--2022-07-08 15:14:52-- ftp://ftp.ensemblgenomes.org/pub/bacteria/release-34/fasta//bacteria_11_collection/candidatus_carsonella_ruddii_pv/dna/%0D => ‘/mnt/c/new/scripts_to_build_SIFT_db/test_files101/candidas100\r/chr-src\r/.listing’ Resolving ftp.ensemblgenomes.org (ftp.ensemblgenomes.org)... 193.62.193.141 Connecting to ftp.ensemblgenomes.org (ftp.ensemblgenomes.org)|193.62.193.141|:21... connected. Logging in as anonymous ... Logged in! ==> SYST ... done. ==> PWD ... done. ==> TYPE I ... done. ==> CWD (1) /pub/bacteria/release-34/fasta//bacteria_11_collection/candidatus_carsonella_ruddii_pv/dna ... done. ==> PASV ... done. ==> LIST ... done.
.listing [ <=> ] 1009 --.-KB/s in 0s
2022-07-08 15:14:53 (45.6 MB/s) - ‘/mnt/c/new/scripts_to_build_SIFT_db/test_files101/candidas100\r/chr-src\r/.listing’ saved [1009]
Removed ‘/mnt/c/new/scripts_to_build_SIFT_db/test_files101/candidas100\r/chr-src\r/.listing’. Rejecting ‘CHECKSUMS’. Rejecting ‘Candidatus_carsonella_ruddii_pv.ASM1036v1.dna.chromosome.Chromosome.fa.gz’. Rejecting ‘Candidatus_carsonella_ruddii_pv.ASM1036v1.dna.toplevel.fa.gz’. Rejecting ‘Candidatus_carsonella_ruddii_pv.ASM1036v1.dna_rm.chromosome.Chromosome.fa.gz’. Rejecting ‘Candidatus_carsonella_ruddii_pv.ASM1036v1.dna_rm.toplevel.fa.gz’. Rejecting ‘Candidatus_carsonella_ruddii_pv.ASM1036v1.dna_sm.chromosome.Chromosome.fa.gz’. Rejecting ‘Candidatus_carsonella_ruddii_pv.ASM1036v1.dna_sm.toplevel.fa.gz’. Rejecting ‘README’. --2022-07-08 15:14:53-- ftp://ftp.ensemblgenomes.org/pub/bacteria/release-34/fasta//bacteria_11_collection/candidatus_carsonella_ruddii_pv/dna/%0D => ‘/mnt/c/new/scripts_to_build_SIFT_db/test_files101/candidas100\r/chr-src\r/%0D’ ==> CWD not required. ==> SIZE \r ... done.
==> PASV ... done. ==> RETR \r ... No such file ‘\r’.
done downloading DNA fasta sequencesdownload dbSNP files Use of uninitialized value $src_site in concatenation (.) or string at download-dbSNP-files.pl line 55. Use of uninitialized value $src_site in concatenation (.) or string at download-dbSNP-files.pl line 60. /dbSNP: Scheme missing. Use of uninitialized value $src_site in concatenation (.) or string at download-dbSNP-files.pl line 64. converting gene format to use-able input ls: cannot access '/gene-annotation-src': No such file or directory gzip: /mnt/c/new/scripts_to_build_SIFT_db/test_files101/candidas100 is a directory -- ignored gzip: /gene-annotation-src.gz: No such file or directory gzip: /Candidatus_carsonella_ruddii_pv.ASM1036v1.34.gtf.gz: No such file or directory done converting gene format /*.gz: No such file or directoryd_SIFT_db/test_files101/candidas100 DNA files do not exist or did not unzip properly
Are you working in Unix or on a Mac? The bioinformatics tools were developed and tested in Unix. I ask because I'm seeing some special characters from your config file, like "\r" and "%0D", in:
/mnt/c/new/scripts_to_build_SIFT_db/test_files101/candidas100\r/chr-src\r/%0D
/candidatus_carsonella_ruddii_pv/dna/%0D
The other thing is that your paths aren't what I expected. I'd expect all your full paths to start with /mnt or /home. You need write permission for the folders in the config file, and unless you're running as root, I don't think you'd have permission to create the folders in the config file (but I'm not familiar with Mac).
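The "\r" in the paths is a carriage return, and "%0D" is the same byte (0x0D) in URL-encoded form. A config file saved with Windows-style line endings ends every line with that byte. The sketch below is one possible way to find and remove them. The function names and the sample keys are made up for illustration and are not part of the SIFT scripts.

```python
def lines_with_carriage_return(data: bytes):
    """Return the 1-based numbers of lines that contain a carriage return byte."""
    return [
        number
        for number, line in enumerate(data.split(b"\n"), start=1)
        if b"\r" in line
    ]


def to_unix_line_endings(data: bytes) -> bytes:
    """Convert CRLF (and any stray lone CR) line endings to plain LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


# Hypothetical config content saved with Windows line endings.
sample = b"EXAMPLE_DIR=/mnt/c/new/example\r\nEXAMPLE_SUBDIR=chr-src\r\n"

print(lines_with_carriage_return(sample))
cleaned = to_unix_line_endings(sample)
print(lines_with_carriage_return(cleaned))
```

To apply this to a real file, read it with `open(path, "rb")` and write the cleaned bytes back with `open(path, "wb")`. A tool such as `dos2unix`, where installed, does the same conversion from the command line.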
Hi Pauline,
I managed to clear that error and I ran the homo sapiens 21 tutorial. I am getting this error. I can see SIFT predictions in the output file, but the database folder doesn't have the id.regions file. Can you please let me know what it is?
Hi Princy,
You need to have Python (v3) installed.
-Pauline
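A quick way to confirm that a Python 3 interpreter is reachable from the shell that runs the pipeline is sketched below. Whether the SIFT scripts call `python3` or `python` is not stated in this thread, so the sketch checks both names. The function name `find_python3` is made up for illustration.

```python
import shutil
import subprocess


def find_python3():
    """Return the path of the first interpreter on PATH reporting major version 3, else None."""
    for name in ("python3", "python"):
        path = shutil.which(name)
        if path is None:
            continue
        # Ask the interpreter itself for its major version number.
        result = subprocess.run(
            [path, "-c", "import sys; print(sys.version_info[0])"],
            capture_output=True,
            text=True,
        )
        if result.returncode == 0 and result.stdout.strip() == "3":
            return path
    return None


found = find_python3()
if found is None:
    print("No Python 3 interpreter found on PATH")
else:
    print(f"Python 3 found at {found}")
```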
I actually got that, Pauline.
Thank you so much for your response.
Hello,
I tried with the test files. It's not working either.
princy_raagul@AG053360D-07006:/mnt/c/Users/josephine.p.johnson/Documents/Variant_dataset/SIFT/scripts_to_build_SIFT_db$ perl make-SIFT-db-all.pl -config test_files101/homo_sapiens-test.txt entered mkdir ./test_files101/homo_sapiens_small/ /GRCh38.83dir ./test_files101/homo_sapiens_small/ converting gene format to use-able input ls: cannot access '/gene-annotation-src': No such file or directory Unable to open for reading done converting gene format /*.gz: No such file or directoryns_small/ DNA files do not exist or did not unzip properly
homo_sapiens-test.txt