Closed by kafker 2 years ago
Hi,
I can’t reproduce the error with those specific accessions, but in general this sort of thing happens when the backend MGnify database is not entirely consistent (which is reasonably often: samples not pointing to an analysis, multiple projects using the same sample, assemblies not having an associated project, etc.). It’s not (I don’t think) a bug in MGnifyR, but rather something that ~should~ be fixed elsewhere.
That said, clearly there needs to be a workaround to be able to use MGnifyR in the way it’s intended. What I tend to do is loop over the individual accessions and wrap each call in a tryCatch. So in your case, you could try:
downloads <- sapply(mglist$V1, function(x) {
  tryCatch(
    mgnify_get_download_urls(mg, x, accession_type = "analyses"),
    # On error, report the failing accession (one per line) and return NA
    error = function(e) { cat("Failed to retrieve", x, "\n"); NA }
  )
})
Your attachment seems to have exported a little funny (try using tab separation and quoting text), but I think you’ve got 4326 unique accessions you’re trying to retrieve. I’ve got the above code running on that list, and will update you once it’s finished. One important thing to look for in the metadata is that the “analyses” are “completed”. Entries can end up in the database (and therefore API results) when there’s no data to actually show because e.g. a run failed.
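The tryCatch loop in this comment puts an NA in the same result vector as the successful retrievals. A minimal sketch of a variant that keeps the retrieved URLs and the failed accessions apart might look like the following. The names `fetch_urls` and `collect_downloads` are hypothetical and not part of MGnifyR; the stub, its error message and the example.org URLs are invented so the example runs offline. In real use the stub would be replaced by a call such as mgnify_get_download_urls(mg, x, accession_type = "analyses").

```r
# Hypothetical stand-in for the real MGnifyR call, so this runs offline.
# The failure condition, error text and URLs are made up for illustration.
fetch_urls <- function(accession) {
  if (accession == "MGYA00100742") stop("no download data for this analysis")
  data.frame(accession = accession,
             download_url = paste0("https://example.org/", accession, ".tsv"),
             stringsAsFactors = FALSE)
}

# Call `fetch` once per accession; on error, log the accession and return NULL.
collect_downloads <- function(accessions, fetch) {
  results <- lapply(accessions, function(x) {
    tryCatch(fetch(x), error = function(e) {
      message("Failed to retrieve ", x, ": ", conditionMessage(e))
      NULL
    })
  })
  names(results) <- accessions
  ok <- !vapply(results, is.null, logical(1))
  # Row-bind the successful results; list the failures separately.
  list(urls = do.call(rbind, results[ok]), failed = accessions[!ok])
}

accs <- c("MGYA00278521", "MGYA00100742", "MGYA00278684")
out <- collect_downloads(accs, fetch_urls)
```

Here `out$urls` holds one row per successful accession and `out$failed` lists the accessions to investigate, which avoids having to pick NA entries out of the result afterwards.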
Ben, I really appreciated your help, and thank you for the chunk of code.
I did not know about the "completed" analysis in the metadata file. This is really helpful.
Best
No problem. By the way - it looks like MGYA00100742 might be the problem. All the rest of the accessions have "analysis_analysis-status" set to “completed”, whereas that one is “QC not passed”. It’s the only problematic one I can see. Am just running the download retrieval on an updated list (minus the bad one) to check it works with a standard "mgnify_get_download_urls".
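As a rough sketch of that check, the accession list can be filtered on the status column before requesting download URLs. The metadata table below is a toy example built by hand; only the column name "analysis_analysis-status" and the two status values come from this thread. The object names `metadata`, `good_accessions` and `skipped` are assumptions for illustration, and the accessions other than MGYA00100742 are assigned "completed" here purely as example data.

```r
# Toy metadata table standing in for the real analyses metadata.
# check.names = FALSE keeps the hyphen in the column name.
metadata <- data.frame(
  accession = c("MGYA00278521", "MGYA00100742", "MGYA00278684"),
  `analysis_analysis-status` = c("completed", "QC not passed", "completed"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# %in% treats a missing status as "not completed" instead of returning NA.
is_completed <- metadata[["analysis_analysis-status"]] %in% "completed"

good_accessions <- metadata$accession[is_completed]   # safe to query
skipped         <- metadata$accession[!is_completed]  # report or inspect these
```

`good_accessions` can then be passed to the download retrieval in place of the full list, while `skipped` records which entries were left out and why.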
There's a good argument from an ease-of-use point of view that MGnifyR should handle errors like this. It's not been implemented yet because really it's a core database issue, but once more users begin finding problems, we might have to rethink.
Edit: The download retrieval does complete successfully once MGYA00100742 is removed from the query accessions.
As the title says, the mgnify_get_download_urls function stops with the following error:
This error is triggered by some of the accessions I have in Marine_samples_metagenome, e.g. MGYA00278521 or MGYA00278684, which have only two URLs. However, it is not triggered if I launch mgnify_get_download_urls on a single accession:
dl_urls_MGYA00278684 <- mgnify_get_download_urls(mg, "MGYA00278684", accession_type = "analyses")
I have attached Marine_samples_metagenome to reproduce the error: Marine_samples_metagenome.txt