hammerlab / cohorts

Utilities for analyzing mutations and neoepitopes in patient cohorts
Apache License 2.0

Add a median VAF filter function and a filter_fn watermark #189

Closed: tavinathanson closed this 7 years ago

tavinathanson commented 7 years ago

Moving some things from my data.py to Cohorts.

Also in this PR:

tavinathanson commented 7 years ago

@jburos added some more stuff, feel free to review at this point.

jburos commented 7 years ago

Nice, thanks @tavinathanson! LGTM. I like the use of strip_column_name. If the conflict happens a lot, we may want to catch the error and provide a more descriptive message in the init function for either Patient or Sample. But for now it's probably good as-is.
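For illustration, a more descriptive message in the init function could look roughly like the sketch below. `PatientSketch` and `RESERVED_FIELDS` are hypothetical names, not the Cohorts API, and the sketch checks for the conflict up front instead of catching the underlying error, which is a simplification.

```python
class PatientSketch:
    """Hypothetical stand-in; the real Patient/Sample constructors may differ."""

    # Assumed set of fields the object already defines on its own.
    RESERVED_FIELDS = ("id", "os")

    def __init__(self, id, os, additional_data=None):
        # Copy so the caller's dictionary is never modified.
        additional_data = dict(additional_data or {})
        # Any key that is both reserved and in additional_data is a conflict.
        conflicts = sorted(set(additional_data) & set(self.RESERVED_FIELDS))
        if conflicts:
            raise ValueError(
                "additional_data repeats field(s) already set on %s: %s. "
                "Remove them from additional_data before constructing it."
                % (type(self).__name__, ", ".join(conflicts)))
        self.id = id
        self.os = os
        self.additional_data = additional_data
```

With this, passing `additional_data={"os": 400}` fails immediately with a message that names the offending column.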

tavinathanson commented 7 years ago

@jburos I had to fix a few failing tests caused by columns such as os appearing both in additional_data and as an existing Patient field. My fix removes that value from the additional_data dictionary, which seems correct given what "additional data" means. We'll definitely hit this error in our other cohorts, though. Replacing e.g. id=row["id"] with id=row.pop("id") should fix it for all problematic columns in RCC/bladder/etc.
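A minimal sketch of why `row.pop("id")` avoids the conflict, assuming `row` behaves like a plain dict whose leftover keys become additional_data. The helper names `split_by_indexing` and `split_by_pop` are hypothetical and only illustrate the difference:

```python
def split_by_indexing(row):
    """Read id with row["id"]; the key stays in the leftover dict."""
    row = dict(row)  # copy so the caller's row is untouched
    patient_id = row["id"]
    return patient_id, row


def split_by_pop(row):
    """Read id with row.pop("id"); the key is removed from the leftover dict."""
    row = dict(row)  # copy so the caller's row is untouched
    patient_id = row.pop("id")
    return patient_id, row


row = {"id": "p1", "os": 400, "age": 61}

_, leftover = split_by_indexing(row)
print(sorted(leftover))  # ['age', 'id', 'os']  ("id" would be duplicated)

_, leftover = split_by_pop(row)
print(sorted(leftover))  # ['age', 'os']  ("id" no longer in the leftovers)
```

The same replacement would apply to any other column (such as os) that is passed explicitly and also forwarded as additional data.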

coveralls commented 7 years ago

Coverage decreased (-0.04%) to 57.706% when pulling 44938b791cd5744a596081db0c45fabc17d70212 on minor_updates into 616cb5f2c27d34cd131a98ffbb556b2aa3dde9f7 on master.