EBI-Metagenomics / MGnifyR

R package for searching, downloading and analysis of EBI MGnify metagenomics data
https://ebi-metagenomics.github.io/MGnifyR/
Artistic License 2.0

mgnify_get_analyses_results stops with the error: Error in t.default(unlist(c(json$attributes[baseattrlist], metlist))) : argument is not a matrix #3

Closed rigormortis0 closed 2 years ago

rigormortis0 commented 2 years ago

Hello @beadyallen and @bsattelb, I have been using this package successfully, but I recently ran into an issue while using mgnify_get_analyses_results:

> tax_results <- mgnify_get_analyses_results(mg, accession_list, retrievelist = c("taxonomy"), usecache = T)

 |======================================                                |  55%
Error in t.default(unlist(c(json$attributes[baseattrlist], metlist))) :
      argument is not a matrix
Calls: mgnify_get_analyses_results ... mgnify_attr_list_to_df_row -> as.data.frame -> t -> t.default

Previous attempts with a smaller list of analysis accessions (accession_list) worked well, but my latest attempts fail with this error at different points of completion, sometimes at 22%, sometimes at 55%. I am not sure what causes it.

I have attached the accession_list in case it helps to reproduce the error.

Thank you for developing this package and for your availability.

accession_list.csv

beadyallen commented 2 years ago

Hi @rigormortis0, thanks for using the package. I've finally got around to testing your accession_list, and it looks like MGYA00575370 isn't a valid analysis accession. This is likely caused by an inconsistency in the backend MGnify database rather than a problem with MGnifyR itself. Can you figure out where you got that particular accession from?

At the moment, the retrieval code doesn't perform much error checking and instead relies on end users to sanity check results in case things go wrong. To work around the issue in general, you can call the function once per accession with sapply/lapply and wrap each call in a tryCatch:

e.g. (untested)

library(plyr)  # provides rbind.fill

# Loop over each accession one by one, handling any errors by returning a
# "Failed <accession>" string in place of the result
results <- lapply(accession_list, function(x) {
    tryCatch(
        mgnify_get_analyses_results(mg, x, retrievelist = c("taxonomy"), usecache = TRUE),
        error = function(e) paste("Failed ", x)
    )
})

# Use rbind.fill from plyr to join all the data frames into one. Needs a little
# mangling to get lists into vectors, and to only process entries that are data.frames
full_dataframe <- do.call(rbind.fill, as.vector(results[sapply(results, is.data.frame)]))
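
The same pattern can be tried without touching the MGnify API. The sketch below is only an illustration in base R: fetch_stub is a hypothetical stand-in for mgnify_get_analyses_results, and the accession names are made up. It also shows one way to keep track of which accessions failed, so they can be checked by hand afterwards.

```r
# Hypothetical stand-in for mgnify_get_analyses_results: returns a one-row
# data.frame for most inputs and raises an error for one "bad" accession.
fetch_stub <- function(accession) {
  if (accession == "BAD_1") {
    stop("argument is not a matrix")
  }
  data.frame(accession = accession, n_chars = nchar(accession),
             stringsAsFactors = FALSE)
}

# Made-up accession names, for illustration only
accessions <- c("GOOD_1", "BAD_1", "GOOD_2")

# Call the function once per accession; on error, return a message string
# instead of aborting the whole loop.
results <- lapply(accessions, function(x) {
  tryCatch(
    fetch_stub(x),
    error = function(e) paste("Failed", x, "-", conditionMessage(e))
  )
})
names(results) <- accessions

# Separate successful results (data.frames) from failures (strings)
ok <- vapply(results, is.data.frame, logical(1))
failed_accessions <- names(results)[!ok]

# Combine the successful results; base rbind is enough here because the
# stub always returns the same columns.
combined <- do.call(rbind, results[ok])

print(failed_accessions)
print(combined)
```

With real MGnifyR results the columns may differ between analyses, which is why rbind.fill is suggested above rather than plain rbind.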

Good luck

Ben