VariantEffect / mavedb-api

MaveDB API
GNU Affero General Public License v3.0

All-NA hgvs columns in input #230

Open afrubin opened 2 weeks ago

The API should remove hgvs columns (hgvs_nt, hgvs_splice, hgvs_pro) in an input data frame if they are all NA.

I ran into this while helping a user who had sensibly based their input on the MaveDB output format, which includes all-NA hgvs columns such as hgvs_splice. The presence of these columns caused the validation code (which inspects the column names in the data frame, not their contents) to assume that all hgvs_nt variants should have the g. prefix instead of the c. prefix that was present, and to display the following error:

[Screenshot of the validation error message]

I imagine we would get similar errors if, for example, someone uploaded a noncoding dataset with n. variants and an all-NA hgvs_pro column.

Since I can't think of any productive reason to include an all-NA hgvs column in an upload, it seems most straightforward to drop such columns rather than rewrite the validation code to check the column contents.
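A rough sketch of what the drop step might look like, assuming the upload has already been parsed into a pandas data frame with missing values read as NaN/None (pandas `read_csv` treats the string `NA` as missing by default). The function name `drop_all_na_hgvs_columns` and the `HGVS_COLUMNS` constant are hypothetical and are not taken from the mavedb-api codebase:

```python
import pandas as pd

# Column names taken from the issue text; the constant itself is hypothetical.
HGVS_COLUMNS = ("hgvs_nt", "hgvs_splice", "hgvs_pro")


def drop_all_na_hgvs_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df without hgvs columns that contain only missing values."""
    # Only consider hgvs columns that are actually present in the input.
    all_na = [
        col for col in HGVS_COLUMNS
        if col in df.columns and df[col].isna().all()
    ]
    # DataFrame.drop returns a new frame; the caller's input is left untouched.
    return df.drop(columns=all_na)


# Example input modelled on the case described above: coding variants with
# c. prefixes and an all-NA hgvs_splice column carried over from the output format.
scores = pd.DataFrame(
    {
        "hgvs_nt": ["c.1A>G", "c.2T>C"],
        "hgvs_splice": [None, None],
        "hgvs_pro": ["p.Met1Val", "p.Met1Thr"],
        "score": [0.5, 1.2],
    }
)

cleaned = drop_all_na_hgvs_columns(scores)
print(list(cleaned.columns))  # ['hgvs_nt', 'hgvs_pro', 'score']
```

If something like this ran before the existing column-name checks, the validation code itself would presumably not need to change. One caveat: a data frame with zero rows would have every hgvs column dropped, so that case may need separate handling.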