wejlab / MetaScope

An R-based approach for preprocessing and aligning 16S, metagenomic, and metatranscriptomic data (PathoScope version 3.0)
GNU General Public License v3.0

Some error when I run the download_refseq #14

Closed kunlingtianxia closed 2 years ago

kunlingtianxia commented 2 years ago

Hi, when I run download_refseq I get the error "Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, : line 514 did not have 23 elements". How can I resolve this? Thank you very much!

image

hjfan527 commented 2 years ago

Hi, the problem seems to stem from using FTP for the link that downloads the reference genome summary table from NCBI. I've switched the protocol from FTP to HTTPS, which appears to fix the problem (pull request: #15).
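For readers following along, the protocol switch can be sketched as a simple change of URL scheme. This is only an illustration: the `ftp://` form of the link and the variable names are assumptions, not the code from the pull request.

```r
# Assumed FTP form of the NCBI summary-table link (hypothetical starting point).
ftp_link <- "ftp://ftp.ncbi.nlm.nih.gov/genomes/refseq/bacteria/assembly_summary.txt"

# Swap only the scheme at the start of the string; host and path stay the same.
https_link <- sub("^ftp://", "https://", ftp_link)

print(https_link)
```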

aubreyodom commented 2 years ago

Just pulled in Howard's changes. Let me know if you are still having issues after you download the latest version @kunlingtianxia

yfuruta commented 2 years ago

Hi,

Sorry for adding a comment on a closed issue. I got the same kind of error when running download_refseq, but with a different line number. Unlike the case of @kunlingtianxia, I always get the same line number regardless of taxon. I installed the latest MetaScope from GitHub; my R version is 4.1.3.

image

Any idea how to resolve this? Thank you so much for developing this great tool!

aubreyodom commented 2 years ago

I just pushed my most recent changes, which include a reorganization of download_refseq() but no major changes to how the function works. The line 1208833 likely refers to a table generated during download_parentkingdom(), which calls get_table(). I created these functions from existing code and slightly altered the process during the reorganization.

Currently, that function call works just fine for me, and I was able to download the Xylella fastidiosa genome without issue.

Can you try it again with this newest update and let me know if you are again experiencing an issue?

yfuruta commented 2 years ago

Hi @aubreyodom,

Thanks for the prompt response! I reinstalled MetaScope with the latest update but unfortunately got the same error. I will try another environment to figure out whether this is an environment-dependent issue (the current one is Win10, RStudio with R 4.1.3, behind some firewalls).

aubreyodom commented 2 years ago

Hmm, ok. I usually build MetaScope on Linux running R 4.1.2, but I could try it on my local PC, which would match your specifications. I'll get back to you soon.

aubreyodom commented 2 years ago

Still no issues for me with Windows 10 and R 4.1.2 (I should update). The line for 120833 does not have 23 elements, but I didn't have a problem with the last few entries being blank or NA.
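One way to check which lines of a tab-delimited file are short is `utils::count.fields()`. The sketch below runs on a small made-up file written to a temporary path, not on the NCBI table; the file contents and variable names are illustrative assumptions.

```r
# Write a made-up tab-delimited file; the third line has only two fields.
tmp <- tempfile(fileext = ".txt")
writeLines(c("a\tb\tc", "1\t2\t3", "4\t5", "6\t7\t8"), tmp)

# Count the fields on every line, with quoting and comment handling turned off
# so the counts reflect tab separators only.
n_fields <- utils::count.fields(tmp, sep = "\t", quote = "", comment.char = "")

# Treat the header's field count as the expected width and list the outliers.
expected <- n_fields[1]
bad_lines <- which(n_fields != expected)

print(n_fields)
print(bad_lines)
```

Running the same check on a locally downloaded copy of the summary table would show whether the reported line is genuinely short in that copy.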

The code that is causing a problem is, I think, only one line. I tried setting fill = TRUE, which I think should circumvent the issue while maintaining the table structure. Can you let me know if this fixes it?

refseq_link <- "https://ftp.ncbi.nlm.nih.gov/genomes/refseq/bacteria/assembly_summary.txt"
refseq_table <- utils::read.table(refseq_link, header = TRUE, sep = "\t", comment.char = "", quote = "", skip = 1, fill = TRUE)

Note: I also tried this on Win10 with the 4.1.3 update and it worked without errors.
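A minimal sketch of what fill = TRUE changes, using a small made-up table instead of the NCBI file (the three-column data and the variable names are illustrative assumptions):

```r
# Made-up tab-delimited table; the last row is missing its third field.
ragged <- "a\tb\tc\n1\t2\t3\n4\t5\n"

# Default behaviour (fill = FALSE): reading stops with an error at the short row.
# tryCatch() captures the error object instead of halting the script.
strict <- tryCatch(
  utils::read.table(text = ragged, header = TRUE, sep = "\t",
                    comment.char = "", quote = ""),
  error = function(e) e
)

# With fill = TRUE the missing trailing field is read as NA.
filled <- utils::read.table(text = ragged, header = TRUE, sep = "\t",
                            comment.char = "", quote = "", fill = TRUE)

print(conditionMessage(strict))
print(filled)
```

Note that with fill = TRUE, read.table determines the number of columns from the first five lines of input, so rows that are too short are padded with NA rather than rejected.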

hjfan527 commented 2 years ago

Hi @yfuruta,

Just a quick check to see if we're working with the same code: if you run MetaScope:::download_refseq, do you see the following?

image

yfuruta commented 2 years ago

Hi @aubreyodom & @hjfan527

Thanks for the responses! It worked!

First, I checked MetaScope::download_refseq as @hjfan527 suggested, and it was identical to the screenshot.

After modifying the source for the line getting refseq_link and refseq_table as @aubreyodom suggested, reinstalling, and running download_refseq, the download started correctly!

By the way, download_refseq worked without any modification in my macOS (Catalina) and Ubuntu 20.04 environments, so the issue was specific to my Win10 environment.

Anyway, thank you so much for your support! I appreciate it!

aubreyodom commented 2 years ago

Hi @yfuruta,

Great to hear! I will add my fix to the source code to prevent possible future errors with grabbing the table.

You're very welcome! Please let us know if any other issues pop up.