The API should remove hgvs columns (hgvs_nt, hgvs_splice, hgvs_pro) in an input data frame if they are all NA.
I ran into this while helping a user who had sensibly based their input on the MaveDB output format, which includes all-NA hgvs columns, among them hgvs_splice. Because the validation code inspects the column names in the data frame, the presence of hgvs_splice led it to assume that all hgvs_nt variants should have the g. prefix instead of the c. prefix that was present, and it displayed the following error:
I would imagine we would get similar errors if (for example) someone uploaded a noncoding dataset with n. variants and an all-NA hgvs_pro column.
Since I can't think of any productive reason for someone to want an all-NA hgvs column in their upload, it seems most straightforward to drop such columns rather than rewrite the validation code to check the column contents.
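
As a rough illustration of the proposed behavior, here is a minimal sketch assuming the input arrives as a pandas data frame. The function name `drop_all_na_hgvs_columns` and the `HGVS_COLUMNS` constant are hypothetical and not part of the existing API; the real fix might live somewhere else in the upload path.

```python
import pandas as pd

# Hypothetical constant: the three hgvs columns named in this issue.
HGVS_COLUMNS = ("hgvs_nt", "hgvs_splice", "hgvs_pro")


def drop_all_na_hgvs_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df without hgvs columns that contain only NA values."""
    # Only consider hgvs columns that are actually present in the input.
    all_na = [
        col for col in HGVS_COLUMNS
        if col in df.columns and df[col].isna().all()
    ]
    # DataFrame.drop returns a new frame; the caller's input is not modified.
    return df.drop(columns=all_na)


# Example shaped like the case described above: c. variants plus an
# all-NA hgvs_splice column copied from the output format.
scores = pd.DataFrame({
    "hgvs_nt": ["c.1A>G", "c.2T>C"],
    "hgvs_splice": [None, None],
    "hgvs_pro": ["p.Met1Val", None],
    "score": [0.5, 1.2],
})
cleaned = drop_all_na_hgvs_columns(scores)
print(list(cleaned.columns))  # ['hgvs_nt', 'hgvs_pro', 'score']
```

Running this before the column-name-based validation would mean the validator never sees an hgvs_splice (or hgvs_pro) column that carries no data. A partially filled column such as hgvs_pro above is kept.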