waldronlab / TCGAutils

Toolbox package for organizing and working with TCGA data
https://bioconductor.org/packages/TCGAutils

mapping between case id and protein expression file id #30

Closed eajeong closed 2 years ago

eajeong commented 2 years ago

It doesn't seem that TCGAutils supports protein expression files.

If I use UUIDtoBarcode on a protein expression file id (43864799-84c0-4c4b-8056-b3571c3e135b), I get the following error message:

UUIDtoBarcode("43864799-84c0-4c4b-8056-b3571c3e135b", from_type = "file_id")
Error in UUIDtoBarcode("43864799-84c0-4c4b-8056-b3571c3e135b", from_type = "file_id") :
  No barcodes found, only case and file UUIDs are supported.

When I run UUIDtoUUID on the case id that corresponds to the above file id (15fb07ae-101c-4553-891d-539cee89a5e9), the result contains no information about the protein expression file.

head(UUIDtoUUID("15fb07ae-101c-4553-891d-539cee89a5e9", to_type = "file_id"))
                               case_id                        files.file_id
1 15fb07ae-101c-4553-891d-539cee89a5e9 8b94ac23-9f47-48ea-9f5f-23dd8a04bb95
2 15fb07ae-101c-4553-891d-539cee89a5e9 8b487f52-685f-429d-8efe-469978367395
3 15fb07ae-101c-4553-891d-539cee89a5e9 05b8efda-aa93-4c04-8beb-ffc95c062af9
4 15fb07ae-101c-4553-891d-539cee89a5e9 2c9512eb-e075-4e45-8ad4-b00048a6d47e
5 15fb07ae-101c-4553-891d-539cee89a5e9 81ac2c46-37db-4dcd-923a-061a7ae626a3
6 15fb07ae-101c-4553-891d-539cee89a5e9 51c30ee1-d27d-4b60-8786-958a5b29276e

You can find the protein expression file in the GDC legacy archive (https://portal.gdc.cancer.gov/legacy-archive/files/43864799-84c0-4c4b-8056-b3571c3e135b).

Thank you in advance.

LiNk-NY commented 2 years ago

Hi Euna (@eajeong), sorry for getting back to you so late. Please use the legacy argument of the function. Unfortunately, it is not possible to auto-detect whether an ID is in the legacy archive, so the user has to supply this information, possibly after looking the file up on the GDC website.

library(TCGAutils)
UUIDtoBarcode("43864799-84c0-4c4b-8056-b3571c3e135b", from_type = "file_id", legacy = TRUE)
#'                                file_id
#' 1 43864799-84c0-4c4b-8056-b3571c3e135b
#'   associated_entities.entity_submitter_id
#' 1             TCGA-OR-A5LT-01A-21-A39K-20
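
If you do not know in advance which archive a file lives in, one possible approach is to try the default lookup first and retry with legacy = TRUE only when it fails. The following is a minimal sketch, not part of TCGAutils: the helper name uuid_to_barcode_with_fallback and its lookup parameter are hypothetical, and it assumes the installed TCGAutils version accepts the legacy argument as in the call above.

```r
# Hypothetical helper (not a TCGAutils function): look an ID up in the
# current archive and fall back to the legacy archive if that call errors.
# `lookup` defaults to TCGAutils::UUIDtoBarcode and is assumed to accept
# a `legacy` argument, as in the call shown above.
uuid_to_barcode_with_fallback <- function(id, from_type = "file_id",
                                          lookup = TCGAutils::UUIDtoBarcode) {
    tryCatch(
        # First attempt: default (non-legacy) lookup
        lookup(id, from_type = from_type),
        error = function(e) {
            # Second attempt: same ID against the legacy archive
            lookup(id, from_type = from_type, legacy = TRUE)
        }
    )
}

# Example call (requires TCGAutils and network access):
# uuid_to_barcode_with_fallback("43864799-84c0-4c4b-8056-b3571c3e135b")
```

Note that any error from the first call triggers the retry, including network failures, so an error raised by the second call does not by itself mean the ID is missing from both archives.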

Best regards, Marcel

eajeong commented 2 years ago

Thank you for your help.