Bioconductor / GenomicDataCommons

Provide R access to the NCI Genomic Data Commons portal.
http://bioconductor.github.io/GenomicDataCommons/

Manifest won't generate for larger groups of files #30

Closed: wwysoc2 closed this issue 7 years ago

wwysoc2 commented 7 years ago

The manifest endpoint consistently fails once more than 109 files are requested: 109 UUIDs work, 110 return a 404. See below:


> query = files(fields=fields,filters=make_filter("experimental_strategy"=="RNA-Seq"&"data_format"=="BAM"),size=20000)
> query
class: files_list
files: 11607
names:
    a7fd6aae-6af5-490f-b65e-97c7bfcf44bf, 7d810229-fa70-4df1-88d6-82e40618ec81, c9eebf0c-3768-43a1-b5c5-874a0d4843c2, ...,
    cf153337-d01f-40bd-8547-92af3f344217, 37b6f86e-c70f-4f01-965e-f18ed9024dd4
> mf2 = manifest(uuids=names(query[1:109])) #Works fine
> mf2 = manifest(uuids=names(query[1:110]))
Error in .gdc_download_one(uri, destination, overwrite = FALSE, progress = FALSE,  : 
  Not Found (HTTP 404).
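The transcript above shows the request succeeding for 109 UUIDs and failing for 110. The thread does not establish why. Assuming the failure depends only on how many UUIDs go into a single request, one possible stopgap is to split the UUIDs into smaller batches and request a manifest per batch. The sketch below covers only the batching step in base R. `chunk_ids` is a hypothetical helper, not part of GenomicDataCommons, and the batch size of 100 is an arbitrary choice below the observed threshold.

```r
# Hypothetical helper (not part of GenomicDataCommons): split a vector of
# file UUIDs into consecutive batches of at most `size` elements.
chunk_ids <- function(ids, size = 100L) {
  stopifnot(size >= 1L)
  if (length(ids) == 0L) return(list())
  # Element i goes to batch ceiling(i / size); split() keeps the input order.
  unname(split(ids, ceiling(seq_along(ids) / size)))
}

# Stand-in IDs so the sketch runs without network access.
ids <- sprintf("uuid-%03d", 1:250)
batches <- chunk_ids(ids, size = 100L)
lengths(batches)

# Assumption: with the interface used in this issue, each batch could then be
# passed on as manifest(uuids = batch). How to combine the per-batch results
# depends on what manifest() returns, so that step is left out here.
```

With real data, `ids` would be `names(query)` from the transcript above.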
seandavi commented 7 years ago

I made some significant changes. In particular, manifest is no longer treated as an endpoint. Instead, it works directly from a files query object (or its result) and now returns a data.frame.

Your code in this new framework would look like:

> respMan = files() %>% 
    select(fields=c('file_name','cases.submitter_id')) %>% 
    filter(~ experimental_strategy == "WXS" & data_format == "BAM") %>% 
    manifest()
> respMan
# A tibble: 22,893 × 5
                                     id
*                                 <chr>
1  164511a9-2f56-49e0-b5cf-9c4be32f8fc7
2  61cb79e3-ec5f-4211-b41d-64ff63b30d0e
3  28a75d46-5b6d-4c40-8472-fccc77e3a2d9
4  0c7aeaff-8408-488d-b1a7-bf52030fec78
5  f715367e-077b-462a-aa5c-79ea61bf0067
6  07804909-3b25-496a-9f50-0f46bf7d7c03
7  1da80f23-c59f-4ace-ba93-a884e49a0ca5
8  ea977b14-a8dc-47c5-aebf-d66e134e057d
9  260e4af3-af8c-42e8-9914-059a08f9d8ae
10 943e35da-f8f6-4991-8511-ad6bf962a66b
# ... with 22,883 more rows, and 4 more variables: filename <chr>, md5 <chr>,
#   size <dbl>, state <chr>

Let me know how that works for you.

seandavi commented 7 years ago

Closing, as I think this has been addressed. Please let me know if it hasn't, and we can reopen.