yeastgenome / AGAPE

AGAPE (Automated Genome Analysis PipelinE) for yeast pan-genome analysis

Method to create fungi_protein.fasta and fungi_est.fasta #3

Closed by kastman 6 years ago

kastman commented 6 years ago

Hi @gtsong,

In the AGAPE paper, you say:

Regions within the contigs, which remained un-annotated or that were labeled ‘undefined’ in the initial phase of AGAPE were analyzed with the MAKER pipeline using all available fungal proteins (downloaded from http://fungi.ensembl.org) and ESTs (downloaded from http://fungidb.org).

Could you elaborate on how you downloaded the proteins and ESTs? Did you write a script to pull all the "Transcript sequences" FASTA files from the FungiDB FTP server and then concatenate them? If so, would it be possible to share it? Also, why not take both the "Proteins, translated CDS (AA)" and the ESTs from FungiDB, instead of getting the proteins from fungi.ensembl.org?

Thanks,
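
For reference, the concatenation step asked about above could look roughly like the sketch below. It is not the AGAPE authors' method, which is not described in this thread. The function name `concatenate_fasta`, the directory layout, and the `*.fasta` pattern are all hypothetical, and it assumes the per-organism FASTA files have already been downloaded into one local directory.

```python
from pathlib import Path


def concatenate_fasta(input_dir, output_path, pattern="*.fasta"):
    """Merge every file matching `pattern` in `input_dir` into one FASTA file.

    Returns the number of source files merged. Hypothetical helper, not part
    of AGAPE or MAKER.
    """
    output_path = Path(output_path)
    # Sort for a reproducible record order; skip the output file itself in
    # case it lives in the same directory and matches the pattern.
    sources = sorted(
        p for p in Path(input_dir).glob(pattern)
        if p.resolve() != output_path.resolve()
    )
    with open(output_path, "wb") as out:
        for src in sources:
            data = src.read_bytes()
            out.write(data)
            # Make sure the next file's header starts on its own line.
            if data and not data.endswith(b"\n"):
                out.write(b"\n")
    return len(sources)
```

A call such as `concatenate_fasta("fungidb_transcripts", "fungi_est.fasta")` would then write the merged file; the directory name here is only an example.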

kastman commented 6 years ago

I was able to find the fungi_est.fasta and fungi_protein.fasta files that were deposited at the SGD Yeastgenome site, but I'm still looking for a description of exactly how they were obtained. For now, this archive is sufficient to replicate the pipeline:

https://downloads.yeastgenome.org/published_datasets/Song_2015_PMID_25781462/AGAPE_cfg_files.tar.gz
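
A minimal sketch of fetching that archive and pulling out the two FASTA files, in case it saves someone a few minutes. It assumes the tarball contains files named fungi_est.fasta and fungi_protein.fasta somewhere inside it; the internal directory layout is not known here, so members are matched by base name. The helper names `download` and `extract_named_files` are made up for this example.

```python
import shutil
import tarfile
import urllib.request
from pathlib import Path

# URL taken from the comment above.
AGAPE_CFG_URL = (
    "https://downloads.yeastgenome.org/published_datasets/"
    "Song_2015_PMID_25781462/AGAPE_cfg_files.tar.gz"
)
# Assumption: these file names appear somewhere inside the archive.
WANTED = ("fungi_est.fasta", "fungi_protein.fasta")


def download(url, dest):
    """Download `url` to the local path `dest` and return it as a Path."""
    urllib.request.urlretrieve(url, dest)
    return Path(dest)


def extract_named_files(archive_path, out_dir, wanted=WANTED):
    """Copy archive members whose base name is in `wanted` into `out_dir`.

    Returns a dict mapping each found file name to its extracted path.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    found = {}
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar.getmembers():
            name = Path(member.name).name
            if member.isfile() and name in wanted:
                target = out_dir / name
                # Stream the member to disk without trusting its stored path.
                with tar.extractfile(member) as src, open(target, "wb") as dst:
                    shutil.copyfileobj(src, dst)
                found[name] = target
    return found
```

Typical use would be `extract_named_files(download(AGAPE_CFG_URL, "AGAPE_cfg_files.tar.gz"), "agape_cfg")`, then checking that both names are present in the returned dict.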