QuackenbushLab / NetworkDataCompanion

An R library of utilities for performing analyses on TCGA and GTEx data using the Network Zoo
GNU General Public License v3.0
4 stars 0 forks source link

reworked barcode filtering to avoid requiring RDS object; added tests #31

Closed katehoffshutta closed 1 year ago

katehoffshutta commented 1 year ago

@FischerJoBio Can you please review the changes to the filtering by barcode functions and confirm they look okay to you? I have removed the rds object argument and the functions now act on barcodes only.

FischerJoBio commented 1 year ago

@katehoffshutta the suggested change changes the sample type from names (e.g., "Primary Tumor") to barcode numbers (e.g. "01"). It, hence, requires the user to know this mapping of numbers to sample types. I would suggest an internal mapping of these intuitive names to numbers. If not, the manual entry and usage in the intro vignette would need to be adapted to this encoding.

katehoffshutta commented 1 year ago

Good point - I did this to enable more user flexibility (e.g. to select primary tumors but not metastases) but it does need some documentation now. I will add a link to the GDC mapping in the man pages for these functions and then update this PR.

Do you think it would be more intuitive to use either of the other two columns (Definition or Short Letter Code)? If so, we could provide an internal mapping from those values to the integer codes, but my preference is to refer the user to the outside reference from GDC for consistency.

FischerJoBio commented 1 year ago

The currently used rds object should already reflect the names behind the official GDC table meaning that also 'Metastatic' can be used to access the corresponding metastases data. I do not have a preference, but think that a description that does not require a manual lookup would be more user-friendly. This could be either the written-out type, or any of the other two with a helper function printing out the mapping from short code to written-out type.

katehoffshutta commented 1 year ago

@FischerJoBio changes adding sample type mapping are now included. Look OK?