wandreopoulos / deeplasmid


Need your help regarding the training dataset #9

Closed a-piece-of-teemo closed 5 months ago

a-piece-of-teemo commented 7 months ago

Hi, I would like to use your dataset to perform different length cuts. Therefore, could you provide the uncut version of the file "archaea.txt.fasta.MIN1kMAX330k.fasta" and its corresponding annotation file from the training data on your website https://portal.nersc.gov/dna/microbial/assembly/deeplasmid/DATA/TRAIN/? Additionally, could you also provide the annotation file corresponding to "refseq.bacteria.nonplasmid.nonmito.fasta.subsam40kreads.fasta"?

wandreopoulos commented 7 months ago

The files you requested are from NCBI RefSeq. You can download the full bacterial and archaeal files from NCBI: https://www.ncbi.nlm.nih.gov/refseq/
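
Once the full RefSeq FASTA files are downloaded, a custom length cut could be re-applied locally. The following is only a minimal sketch, not the deeplasmid preprocessing itself: it assumes the `MIN1kMAX330k` suffix in the training filename means that sequences between 1,000 and 330,000 bp were kept, which is inferred from the filename and not confirmed in this thread. The function names `read_fasta` and `filter_by_length` are hypothetical.

```python
from typing import Iterable, Iterator, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequences may span several lines
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(
    records: Iterable[Record],
    min_len: int = 1_000,    # assumed meaning of "MIN1k"
    max_len: int = 330_000,  # assumed meaning of "MAX330k"
) -> Iterator[Record]:
    """Keep only records whose sequence length lies in [min_len, max_len]."""
    for header, seq in records:
        if min_len <= len(seq) <= max_len:
            yield header, seq
```

To use it on a downloaded file, open the file and pass the handle to `read_fasta`, then change `min_len` and `max_len` to whatever cut is needed. Both functions are generators, so large RefSeq files are processed one record at a time.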

Thanks,
Bill

William B. Andreopoulos, Ph.D.
Joint Genome Institute, LBNL