hammerlab / cohorts

Utilities for analyzing mutations and neoepitopes in patient cohorts
Apache License 2.0

Annoying conflicts between patient-attributes & variables in `additional_data` #220

Open jburos opened 7 years ago

jburos commented 7 years ago

I often create `Patient` objects where a required attribute (e.g. `os`) is also present in the clinical data for that subject. I tend to give these fields short, self-explanatory names, which more often than not conflict with the input attributes already set on the same object.

E.g. if I create a Patient object where the clinical data contains a field os, I get the following error:

ValueError: Key `os` in additional_data already exists in this object
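
For illustration only, a check that produces an error like this might look as follows. `PatientSketch` is a hypothetical stand-in, not the actual `Patient` class in cohorts, whose constructor and internals may differ; the field names and values are made up.

```python
class PatientSketch:
    """Hypothetical stand-in for a patient object; NOT the real cohorts API."""

    def __init__(self, id, os, additional_data=None):
        self.id = id
        self.os = os
        # Copy each extra field onto the object, refusing to overwrite
        # an attribute that has already been set above.
        for key, value in (additional_data or {}).items():
            if hasattr(self, key):
                raise ValueError(
                    "Key `%s` in additional_data already exists in this object" % key
                )
            setattr(self, key, value)


# The clinical row carries its own `os` field, so construction fails.
try:
    PatientSketch(id="p1", os=400, additional_data={"os": 400, "age": 61})
except ValueError as err:
    message = str(err)
```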

The solution appears to be for me to either:

  1. rename my key fields so they don't conflict with the default names of attributes set by `Patient`, or
  2. drop the key fields from the clinical data.
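
A rough sketch of both workarounds, applied to a single clinical record held as a plain dict. The helper names, the `_clinical` suffix, and the contents of `RESERVED` are assumptions made for this example; the real set of reserved attribute names would have to be taken from `Patient` itself.

```python
# Assumed set of attribute names that would collide; illustrative only.
RESERVED = {"id", "os", "pfs"}


def rename_conflicts(additional_data, reserved, suffix="_clinical"):
    """Workaround 1: return a copy with conflicting keys renamed."""
    return {
        (key + suffix if key in reserved else key): value
        for key, value in additional_data.items()
    }


def drop_conflicts(additional_data, reserved):
    """Workaround 2: return a copy without the conflicting keys."""
    return {
        key: value
        for key, value in additional_data.items()
        if key not in reserved
    }


row = {"os": 400, "age": 61}
renamed = rename_conflicts(row, RESERVED)
dropped = drop_conflicts(row, RESERVED)
```

Both helpers leave the original record untouched, so the same clinical data can still be used elsewhere under its original field names.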

Neither feels ideal. It would be nice (but not urgent) if we could avoid these conflicts between the names of the `Patient` attributes set by Cohorts.Patient and the names of the fields in the dataframe (`additional_data`), perhaps by giving one or the other an unnatural prefix, thereby limiting the potential for conflicts.
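
One way the prefix idea could work, sketched with a hypothetical class rather than the real `Patient`: every key from `additional_data` is stored under a prefix that is unlikely to appear in a hand-written field name. The prefix `data__` and the class name are assumptions for illustration only.

```python
class PrefixedPatientSketch:
    """Hypothetical stand-in; NOT the real cohorts API."""

    # Assumed "unnatural" prefix for fields coming from additional_data.
    DATA_PREFIX = "data__"

    def __init__(self, id, os, additional_data=None):
        self.id = id
        self.os = os
        # Prefixed names cannot collide with the attributes set above.
        for key, value in (additional_data or {}).items():
            setattr(self, self.DATA_PREFIX + key, value)


patient = PrefixedPatientSketch(
    id="p1", os=400, additional_data={"os": 395, "age": 61}
)
attribute_os = patient.os                 # value passed as the attribute
clinical_os = getattr(patient, "data__os")  # value from the clinical data
```

The trade-off is that every clinical field now has to be accessed under its prefixed name, which is less convenient but keeps both values available.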