fiuzatayna / epitocore

download error #1

Open aslangabriel99 opened 1 year ago

aslangabriel99 commented 1 year ago

I searched for and downloaded 38 proteomes from the UniProt database, but they can't be used directly with this tool. The error occurred after I edited command_sequence.sh and added the species information: most of the sequences failed to download. For example: "Unexpected error with download ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/018/972/085/GCA_018972085.1_ASM1897208v1/GCA_018972085.1_ASM1897208v1_protein.faa.gz "

fiuzatayna commented 1 year ago

Looking into it, aslangabriel99.

On Mon, Apr 24, 2023 at 11:09, aslangabriel99 < @.***> wrote:

I searched for and downloaded 38 proteomes from the UniProt database, but they can't be used directly with this tool. The error occurred after I edited command_sequence.sh and added the species information: most of the sequences failed to download. For example: "Unexpected error with download ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/018/972/085/GCA_018972085.1_ASM1897208v1/GCA_018972085.1_ASM1897208v1_protein.faa.gz "

fiuzatayna commented 1 year ago

Hello aslangabriel99, I successfully downloaded 31 proteomes from NCBI using the query "Riemerella anatipestifer" in the command_sequence.sh file (echo "Riemerella anatipestifer" > species_file); see below.

Is your internet connection stable? Could you please try again and let me know if it works?
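If the connection is flaky, retrying each file a few times may be enough. The following is a minimal sketch of that idea, not part of EpitoCore: `download_with_retry`, `fetch`, and `to_https` are hypothetical helper names, and the HTTPS variant assumes the server exposes the same paths over https:// as over ftp://.

```python
import time
import urllib.request
from urllib.error import URLError


def to_https(url):
    """Swap ftp:// for https:// (assumption: the host serves the same paths over HTTPS)."""
    prefix = "ftp://"
    return "https://" + url[len(prefix):] if url.startswith(prefix) else url


def fetch(url, dest):
    """Download url to the local path dest using only the standard library."""
    urllib.request.urlretrieve(url, dest)


def download_with_retry(url, dest, attempts=3, delay=2.0, fetcher=fetch, sleep=time.sleep):
    """Try the download up to `attempts` times, doubling the wait after each failure.

    Returns True on success and False if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        try:
            fetcher(url, dest)
            return True
        except (URLError, OSError) as exc:
            print(f"Attempt {attempt}/{attempts} failed for {url}: {exc}")
            if attempt < attempts:
                sleep(delay)
                delay *= 2  # simple exponential backoff
    return False
```

For example, `download_with_retry(to_https(url), "GCA_018972085.1_ASM1897208v1_protein.faa.gz")` would retry the file from the report over HTTPS. The `fetcher` and `sleep` parameters exist only so the function can be exercised without network access.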

@.*:/home# ./command_sequence.sh

Start EpitoCore? [PRESS ENTER]

How many CPUs may be used? 20

GET PROTEOME

** Getting paths of proteins from Riemerella anatipestifer to download.

Trying to download ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/001/051/075/GCA_001051075.1_ASM105107v1/GCA_001051075.1_ASM105107v1_protein.faa.gz

100% [............................................................................] 359444 / 359444 Downloaded

Trying to download ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/001/051/115/GCA_001051115.1_ASM105111v1/GCA_001051115.1_ASM105111v1_protein.faa.gz

100% [............................................................................] 392355 / 392355 Downloaded

Trying to download ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/001/670/765/GCA_001670765.2_ASM167076v2/GCA_001670765.2_ASM167076v2_protein.faa.gz

100% [............................................................................] 448469 / 448469 Downloaded

Trying to download ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/002/025/185/GCA_002025185.1_ASM202518v1/GCA_002025185.1_ASM202518v1_protein.faa.gz

100% [............................................................................] 463614 / 463614 Downloaded

Trying to download ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/006/385/095/GCA_006385095.1_ASM638509v1/GCA_006385095.1_ASM638509v1_protein.faa.gz

100% [............................................................................] 419000 / 419000 Downloaded

Trying to download ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/009/496/935/GCA_009496935.1_ASM949693v1/GCA_009496935.1_ASM949693v1_protein.faa.gz

Unexpected error with download ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/009/496/935/GCA_009496935.1_ASM949693v1/GCA_009496935.1_ASM949693v1_protein.faa.gz

Trying to download ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/015/291/805/GCA_015291805.1_ASM1529180v1/GCA_015291805.1_ASM1529180v1_protein.faa.gz

100% [............................................................................] 426150 / 426150 Downloaded

Trying to download ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/018/972/085/GCA_018972085.1_ASM1897208v1/GCA_018972085.1_ASM1897208v1_protein.faa.gz

100% [............................................................................] 450689 / 450689 Downloaded

Trying to download ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/186/945/GCA_900186945.1_48903_E01/GCA_900186945.1_48903_E01_protein.faa.gz

100% [............................................................................] 432787 / 432787 Downloaded

Trying to download ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/183/155/GCA_000183155.1_ASM18315v1/GCA_000183155.1_ASM18315v1_protein.faa.gz

100% [............................................................................] 427202 / 427202 Downloaded

Trying to download ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/252/855/GCA_000252855.1_ASM25285v1/GCA_000252855.1_ASM25285v1_protein.faa.gz

100% [............................................................................] 410779 / 410779 Downloaded

Trying to download ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/191/565/GCA_000191565.1_ASM19156v1/GCA_000191565.1_ASM19156v1_protein.faa.gz

100% [............................................................................] 418562 / 418562 Downloaded

Trying to download ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/295/655/GCA_000295655.1_ASM29565v1/GCA_000295655.1_ASM29565v1_protein.faa.gz

100% [............................................................................] 451110 / 451110 Downloaded

Trying to download ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/331/695/GCA_000331695.1_ASM33169v1/GCA_000331695.1_ASM33169v1_protein.faa.gz

100% [............................................................................] 425941 / 425941 Downloaded

Trying to download ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/734/055/GCA_000734055.1_ASM73405v1/GCA_000734055.1_ASM73405v1_protein.faa.gz

100% [............................................................................] 443577 / 443577 Downloaded

Trying to download ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/001/077/795/GCA_001077795.1_ASM107779v1/GCA_001077795.1_ASM107779v1_protein.faa.gz

100% [............................................................................] 426503 / 426503 Downloaded

Decompressing get_proteome_output//Riemerella_anatipestifer_CH3_strain=CH3_GCA_000734055.1_ASM73405v1_protein.faa.gz

Decompressing get_proteome_output//Riemerella_anatipestifer_strain=WJ4_GCA_006385095.1_ASM638509v1_protein.faa.gz

Decompressing get_proteome_output//Riemerella_anatipestifer_RA-CH-2_strain=RA-CH-2_GCA_000331695.1_ASM33169v1_protein.faa.gz

Decompressing get_proteome_output//Riemerella_anatipestifer_strain=HXb2_GCA_002025185.1_ASM202518v1_protein.faa.gz

Decompressing get_proteome_output//Riemerella_anatipestifer_strain=153_GCA_001051115.1_ASM105111v1_protein.faa.gz

Decompressing get_proteome_output//Riemerella_anatipestifer_strain=XG19_GCA_018972085.1_ASM1897208v1_protein.faa.gz

Decompressing get_proteome_output//Riemerella_anatipestifer_strain=RCAD0133_GCA_001670765.2_ASM167076v2_protein.faa.gz

Decompressing get_proteome_output//Riemerella_anatipestifer_strain=17_GCA_001051075.1_ASM105107v1_protein.faa.gz

Decompressing get_proteome_output//Riemerella_anatipestifer_RA-CH-1_strain=RA-CH-1_GCA_000295655.1_ASM29565v1_protein.faa.gz

Decompressing get_proteome_output//Riemerella_anatipestifer_Yb2_strain=Yb2_GCA_001077795.1_ASM107779v1_protein.faa.gz

Decompressing get_proteome_output//Riemerella_anatipestifer_ATCC11845=_DSM_15868_strain=DSM_15868_GCA_000183155.1_ASM18315v1_protein.faa.gz

Decompressing get_proteome_output//Riemerella_anatipestifer_RA-GD_strain=RA-GD_GCA_000191565.1_ASM19156v1_protein.faa.gz

Decompressing get_proteome_output//Riemerella_anatipestifer_strain=RCAD0392_GCA_015291805.1_ASM1529180v1_protein.faa.gz

Decompressing get_proteome_output//Riemerella_anatipestifer_strain=NCTC11014_GCA_900186945.1_48903_E01_protein.faa.gz

Decompressing get_proteome_output//Riemerella_anatipestifer_ATCC11845=_DSM_15868_strain=ATCC_11845_GCA_000252855.1_ASM25285v1_protein.faa.gz
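To see which files need another attempt, the failed URLs can be pulled out of a saved copy of this output. This is a small sketch that relies only on the "Unexpected error with download" message shown in the log above; the function name `failed_urls` is hypothetical, and saving the output to a text file first is an assumption about the workflow.

```python
import re

# Matches the failure line printed in the log and captures the URL that follows it.
FAILED = re.compile(r"Unexpected error with download\s+(\S+)")


def failed_urls(log_text):
    """Return the URLs reported as failed, in order of appearance, without repeats."""
    seen = []
    for url in FAILED.findall(log_text):
        url = url.strip('"')  # drop a quote if the line was pasted inside quotation marks
        if url not in seen:
            seen.append(url)
    return seen
```

Applied to the log above, this would report only the GCA_009496935.1 file, which is the single "Unexpected error" line in that run.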

aslangabriel99 commented 1 year ago

Thanks for your help. I have another question, about setting these parameters: "IEDB consensus percentile rank thresholds", "Minimal number of strains in which a candidate peptide must be found", and "Name of CMG Biotools clustering file". Could you give me some rules of thumb? Also, I want to design vaccines for animals such as ducks, so how should I test epitopes with IEDB, which was built for humans?

fiuzatayna commented 1 year ago

For now, since the pipeline uses human MHC alleles, it is not suitable for your needs. I will look into duck MHC polymorphism and structure to see whether the pipeline could be adjusted.