Wytamma / GISAIDR

Programmatically interact with the GISAID database.

Error in Matches #57

Open rafischulman opened 2 months ago

rafischulman commented 2 months ago

Hi,

Thank you so much for this excellent tool. I've been using it as an efficient way to query many sequences from across a region. Lately I've been having some trouble downloading sequences and metadata. See below for my error message.

I used one of the query examples from the README to get a list of accession IDs, then hit the error when I attempted to download them.

Thanks for the help.

# GISAID login

credentials <- login(username = "", password = "")

df <- query(
  credentials = credentials,
  location = "Oceania",
  from_subm = "2024-07-26",
  to_subm = "2024-07-28",
  fast = TRUE
)
Selecting all 6 accession_ids.
Returning 0-6 of 6 accession_ids.

head(df$accession_id)
[1] "EPI_ISL_19293748" "EPI_ISL_19293749" "EPI_ISL_19293750" "EPI_ISL_19293751" "EPI_ISL_19293752" "EPI_ISL_19293753"

full_df <- download(credentials = credentials, list_of_accession_ids = df$accession_id, get_sequence = TRUE)
Selecting entries...
Error in matches[[1]][[2]] : subscript out of bounds

PabloOfEpidemiology commented 2 months ago

Getting the same error!

Wytamma commented 2 months ago

It looks like GISAID has added a CAPTCHA to stop automated access. I'd suggest saving the accession_ids to a file and manually entering them into the full text search field.

image
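For the first half of that workaround, here is a minimal sketch in base R. It assumes the IDs come from `df$accession_id` as in the original post; the vector below is only a stand-in. The file name `accession_ids.txt` is made up for illustration, and writing one ID per line is an assumption about what pastes cleanly into the search field.

```r
# Stand-in for df$accession_id from the query() call in the original post.
accession_ids <- c("EPI_ISL_19293748", "EPI_ISL_19293749", "EPI_ISL_19293750")

# Hypothetical output file; one accession ID per line.
out_file <- file.path(tempdir(), "accession_ids.txt")
writeLines(accession_ids, out_file)

# Read the file back to confirm nothing was lost.
saved_ids <- readLines(out_file)
```

Swap `tempdir()` for a real directory if the file should outlive the R session.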

PabloOfEpidemiology commented 2 months ago

Interestingly, I can usually download the first few batches on the first run of the script, but around the 6th batch (one batch is only 10 sequences) the error appears. Here's the console output:

Downloading batch 1 out of 40
Selecting entries...
Compressing data. Please wait...
Data ready.
Downloading...
tar: Removing leading '/' from member names
Downloading batch 2 out of 40
Selecting entries...
Compressing data. Please wait...
Data ready.
Downloading...
tar: Removing leading '/' from member names
Downloading batch 3 out of 40
Selecting entries...
Compressing data. Please wait...
Data ready.
Downloading...
tar: Removing leading '/' from member names
Downloading batch 4 out of 40
Selecting entries...
Compressing data. Please wait...
Data ready.
Downloading...
tar: Removing leading '/' from member names
Downloading batch 5 out of 40
Selecting entries...
Compressing data. Please wait...
Data ready.
Downloading...
tar: Removing leading '/' from member names
Downloading batch 6 out of 40
Selecting entries...
Encountered an error: subscript out of bounds

Then if I rerun the script, the error happens right away, on the very first batch. So it seems their UI takes some time to detect automation.
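Since a failure partway through otherwise loses the batches that already succeeded, a rough sketch of a batching loop that keeps them might look like the following. This is not GISAIDR code: `download_in_batches` and its `fetch` argument are hypothetical names, and `fake_fetch` only simulates a failure on the third call. It does nothing about the CAPTCHA itself; it just records which IDs are still outstanding.

```r
# Download IDs in batches; stop at the first failure but keep what succeeded.
# `fetch` is any function that takes a character vector of IDs and returns
# a data frame (hypothetical interface, not part of GISAIDR).
download_in_batches <- function(ids, fetch, batch_size = 10) {
  batches <- split(ids, ceiling(seq_along(ids) / batch_size))
  results <- list()
  for (i in seq_along(batches)) {
    message("Downloading batch ", i, " out of ", length(batches))
    # Capture the error as a value instead of aborting the whole loop.
    res <- tryCatch(fetch(batches[[i]]), error = function(e) e)
    if (inherits(res, "error")) {
      message("Encountered an error: ", conditionMessage(res))
      break
    }
    results[[length(results) + 1]] <- res
  }
  done <- if (length(results) > 0) do.call(rbind, results) else NULL
  n_done <- length(results) * batch_size
  list(data = done, remaining = ids[seq_along(ids) > n_done])
}

# Fake fetch that fails on its third call, to mimic the behaviour above.
calls <- 0
fake_fetch <- function(batch) {
  calls <<- calls + 1
  if (calls == 3) stop("subscript out of bounds")
  data.frame(accession_id = batch)
}

ids <- sprintf("EPI_ISL_%d", 1:25)
out <- download_in_batches(ids, fake_fetch, batch_size = 10)
```

With the real package, `fetch` could presumably be a small wrapper such as `function(b) download(credentials = credentials, list_of_accession_ids = b, get_sequence = TRUE)`, using the same arguments as the call in the original post. `out$remaining` then holds the IDs that would still need to be fetched manually.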

dawnmy commented 2 months ago

It looks like GISAID has added a CAPTCHA to stop automated access. I'd suggest saving the accession_ids to a file and manually entering them into the full text search field.

image

This really sucks