thackl / gggenomes

A grammar of graphics for comparative genomics
https://thackl.github.io/gggenomes/

issue with read_seqs() #62

Closed aleuUH closed 3 years ago

aleuUH commented 3 years ago

Hi @thackl

I am very interested in trying out your package (looks amazing!). However, I am having trouble importing data:

read_seqs(ex("emales/emales.fna"), parse_desc=FALSE)
Reading 'fasta' with read_seq_len():
  • file_id: emales [C:/Users/leua/Documents/R/win-library/4.0/gggenomes/extdata/emales/emales.fna]
# A tibble: 0 x 1
# ... with 1 variable: file_id

Do you know why that is?

Thanks, Andy

thackl commented 3 years ago

Hi Andy,

Ah, yes, sorry about that. I suspect that some calls on file might have an issue on Windows. I haven't done a lot of testing on Windows yet (only Linux and Mac). As a quick workaround, try reading the following file instead. It contains the same information, the IDs and lengths of the sequences, but already parsed out of the fasta file using seqkit faidx -f emales.fna (https://bioinf.shenwei.me/seqkit/usage/#faidx). For your own fasta files, you could do the same indexing with seqkit.

read_seqs(ex("emales/emales.fna.seqkit.fai"))
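
If seqkit is not available either, the same two pieces of information (sequence IDs and lengths) can be collected in base R. The following is only a minimal sketch: `fasta_lengths()` is a hypothetical helper, not part of gggenomes, and it assumes the FASTA file is small enough to read into memory in one go.

```r
# Hypothetical helper (not part of gggenomes): collect sequence IDs and
# lengths from a FASTA file using base R only.
fasta_lengths <- function(path) {
  lines <- readLines(path, warn = FALSE)
  lines <- sub("\r$", "", lines)        # drop stray Windows carriage returns
  lines <- lines[nzchar(lines)]         # ignore empty lines
  is_header <- startsWith(lines, ">")
  if (!any(is_header)) stop("no FASTA headers found in ", path)

  # Discard anything that appears before the first header
  keep <- cumsum(is_header) > 0
  lines <- lines[keep]
  is_header <- is_header[keep]

  # Each line belongs to the record opened by the most recent header
  record <- cumsum(is_header)

  # The ID is the header text up to the first whitespace
  ids <- sub("^>(\\S+).*$", "\\1", lines[is_header])

  # Count residues per record; header lines contribute nothing
  n <- nchar(lines)
  n[is_header] <- 0L
  lens <- as.integer(tapply(n, record, sum))

  data.frame(seq_id = ids, length = lens, stringsAsFactors = FALSE)
}
```

The result has `seq_id` and `length` columns, which, as far as I know, is the layout gggenomes expects for a `seqs` data frame, so it could presumably be passed along as `gggenomes(seqs = fasta_lengths("my.fna"))`. Treat that last step as an assumption and check it against the package documentation.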

aleuUH commented 3 years ago

ah. Seems to be working now.

Thanks for the quick response!

thackl commented 3 years ago

Great!

(Note to self: related to #36)

gunjanpandey commented 1 year ago

I am still getting this error. Is this sorted?

`library("gggenomes")
library("seqTools")
library("curl")

curl_download("https://ftp.ncbi.nlm.nih.gov/genomes/refseq/invertebrate/Bactrocera_neohumeralis/representative/GCF_024586455.1_APGP_CSIRO_Bneo_wtdbg2-racon-allhic-juicebox.fasta_v2/GCF_024586455.1_APGP_CSIRO_Bneo_wtdbg2-racon-allhic-juicebox.fasta_v2_genomic.fna.gz", destfile = "neo.fna")

writeFai("neo.fna", "neo.fai")
read_seqs(ex("neo.fai"))

Error in system.file("extdata", file, package = "gggenomes", mustWork = TRUE) : no file found`

iimog commented 1 year ago

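The only problem is the ex() call in your last line. As the system.file() call in your error message shows, ex() looks for example files in the extdata folder of the installed gggenomes package, so it cannot find a file in your working directory. If you change this line to read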

read_seqs("neo.fai")

your code works for me.
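
To make the distinction concrete, here is a small sketch of the two kinds of lookup. `find_input()` is a hypothetical helper, not something gggenomes provides. It relies only on base R's `file.exists()` and `system.file()`, the latter being the call that appears in the error message above.

```r
# Hypothetical helper (not part of gggenomes): prefer a local file, and only
# fall back to example data bundled with an installed package.
find_input <- function(file, package = "gggenomes") {
  # A path that exists on disk is used as-is
  if (file.exists(file)) return(file)

  # Otherwise look in the package's extdata folder; system.file() returns ""
  # when the file (or the package) cannot be found
  bundled <- system.file("extdata", file, package = package)
  if (nzchar(bundled)) return(bundled)

  stop("'", file, "' is neither a local file nor bundled with ", package)
}
```

With this, `find_input("neo.fai")` would return the local path once `writeFai()` has created the file, while `find_input("emales/emales.fna")` should resolve to the copy shipped with the package, provided it is installed.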