AlexsLemonade / refinebio

Refine.bio harmonizes petabytes of publicly available biological data into ready-to-use datasets for cancer researchers and AI/ML scientists.
https://www.refine.bio/
Other
126 stars 19 forks

Change Sample property genotype to genetic_information #3408

Closed davidsmejia closed 8 months ago

davidsmejia commented 8 months ago

Context

We currently use genotype as the attribute on the Sample model. This should have been changed to the more appropriately named genetic_information.

https://github.com/AlexsLemonade/refinebio-admin/pull/116#issuecomment-1605154840

The attribute on the sample is genotype, but the harmonizer as written assigns the value to sample.genetic_information, which does match the docs. We should either change the docs to say genotype and update the harmonizer (which I will do for now), or add a migration on the DB that renames the column to genetic_information.
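To illustrate why a mismatch like this can go unnoticed, here is a minimal sketch in plain Python. The `Sample` class, its `FIELDS` tuple, the `persisted_fields` helper, and the accession code are hypothetical stand-ins, not the actual refine.bio model. The sketch only shows that assigning to an attribute name that is not a declared field raises no error, so the value is never part of what gets saved.

```python
class Sample:
    """Hypothetical stand-in for the Sample model; only `genotype` is stored."""

    # Assumed list of stored fields, for illustration only.
    FIELDS = ("accession_code", "genotype")

    def __init__(self, accession_code):
        self.accession_code = accession_code
        self.genotype = ""

    def persisted_fields(self):
        # Only declared fields would be written to the database.
        return {name: getattr(self, name) for name in self.FIELDS}


sample = Sample("GSM000001")  # placeholder accession code

# Python allows setting an undeclared attribute, so this does not raise.
sample.genetic_information = "wild type"

# The harmonized value is absent from what would be saved.
saved = sample.persisted_fields()
```

In this sketch `saved` still holds an empty `genotype` and has no `genetic_information` key, which is the kind of silent data loss that renaming the attribute consistently would avoid.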

Problem or idea

Updating this should be pretty straightforward.

One caveat is the harmonizer. For the accompanying PR for this issue, let's not rewrite the harmonizer and instead rename genotype_fields, self.genotype_fields, and the string that is passed in here: https://github.com/AlexsLemonade/refinebio/blob/dev/foreman/data_refinery_foreman/surveyor/harmony.py#L800 to matching versions of genetic_information.
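A rough sketch of what that rename could look like, assuming a heavily simplified harmonizer. The `Harmonizer` class, the `harmonize_field` method, the example field names, and the new name `genetic_information_fields` are all assumptions made for illustration; the real code in harmony.py is structured differently.

```python
class Harmonizer:
    """Hypothetical, heavily simplified harmonizer used only for illustration."""

    def __init__(self):
        # Previously named self.genotype_fields (the new name is an assumption).
        self.genetic_information_fields = {"genotype", "genetic background"}

    def harmonize_field(self, metadata, harmonized, attribute, field_names):
        # Copy the first matching raw metadata key onto the harmonized attribute.
        for key, value in metadata.items():
            if key.lower().strip() in field_names:
                harmonized[attribute] = value
                break

    def harmonize(self, metadata):
        harmonized = {}
        # The attribute string passed here is renamed to "genetic_information".
        self.harmonize_field(
            metadata,
            harmonized,
            "genetic_information",
            self.genetic_information_fields,
        )
        return harmonized


result = Harmonizer().harmonize({"Genotype": "wild type", "tissue": "liver"})
```

Only the names change here; the matching logic stays the same, which keeps the PR small and avoids a rewrite of the harmonizer.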

Solution or next step