hail-is / hail

Cloud-native genomic dataframes and batch computing
https://hail.is
MIT License

[query] bad error message when indexing a matrix table with row keys #14237

Open danking opened 9 months ago

danking commented 9 months ago

What happened?

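I indexed a matrix table using only its row key fields. I expected this to raise an error with a message like "when indexing a matrix table, you must provide a row key and column key". Instead, the error reports a mismatch in the number of row key fields: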

In [3]: import hail as hl
   ...: mt = hl.balding_nichols_model(1, 1, 1)
   ...: mt2 = hl.balding_nichols_model(1,1,1)
   ...: 
   ...: mt.annotate_rows(x=mt2[mt.locus, mt.alleles])
2024-02-01 13:16:23.573 Hail: INFO: balding_nichols_model: generating genotypes for 1 populations, 1 samples, and 1 variants...
2024-02-01 13:16:23.594 Hail: INFO: balding_nichols_model: generating genotypes for 1 populations, 1 samples, and 1 variants...
---------------------------------------------------------------------------
ExpressionException                       Traceback (most recent call last)
Cell In[3], line 5
      2 mt = hl.balding_nichols_model(1, 1, 1)
      3 mt2 = hl.balding_nichols_model(1,1,1)
----> 5 mt.annotate_rows(x=mt2[mt.locus, mt.alleles])

File ~/projects/hail/hail/python/hail/matrixtable.py:818, in MatrixTable.__getitem__(self, item)
    815 col_key = wrap_to_tuple(exprs[1])
    817 try:
--> 818     return self.index_entries(row_key, col_key)
    819 except TypeError as e:
    820     raise invalid_usage from e

File ~/projects/hail/hail/python/hail/matrixtable.py:3193, in MatrixTable.index_entries(self, row_exprs, col_exprs)
   3191     return self.index_entries(tuple(row_exprs[0].values()), col_exprs)
   3192 elif len(row_exprs) != len(self.row_key):
-> 3193     raise ExpressionException(
   3194         f'Key mismatch: matrix table has {len(self.row_key)} row key fields, '
   3195         f'found {len(row_exprs)} index expressions'
   3196     )
   3197 else:
   3198     raise ExpressionException(
   3199         f"Key type mismatch: Cannot index matrix table with given expressions\n"
   3200         f"  MatrixTable row key:   {', '.join(str(t) for t in self.row_key.dtype.values())}\n"
   3201         f"  Row index expressions: {', '.join(str(e.dtype) for e in row_exprs)}"
   3202     )

ExpressionException: Key mismatch: matrix table has 2 row key fields, found 1 index expressions

Version

0.2.127

danking commented 9 months ago

NOTE: This issue is only about the error message. We can definitely produce a more insightful error message (perhaps suggesting the use of .rows()[key_field1, key_field2]) without also addressing the confusing syntax.
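
As a rough illustration of the kind of message suggested above, here is a minimal sketch in plain Python. It is not Hail's implementation: the function name `row_key_mismatch_message`, its parameters, and the heuristic for deciding when to show the hint are all hypothetical. The first line of the message is copied from the traceback above, and the hint reuses the wording and the `.rows()[...]` suggestion from this issue.

```python
# Hypothetical sketch, not Hail's code: build a clearer message for the case
# where the number of row index expressions does not match the row key.

def row_key_mismatch_message(row_key_fields, n_row_exprs, n_col_exprs):
    """Return an error message for a row-key count mismatch when indexing entries.

    row_key_fields: names of the matrix table's row key fields
    n_row_exprs:    number of expressions interpreted as the row key
    n_col_exprs:    number of expressions interpreted as the column key
    """
    n_keys = len(row_key_fields)
    message = (
        f"Key mismatch: matrix table has {n_keys} row key fields, "
        f"found {n_row_exprs} index expressions"
    )
    # Heuristic (an assumption made for this sketch): if all the expressions
    # together would exactly fill the row key, the caller probably meant to
    # index by row key only.
    if n_row_exprs + n_col_exprs == n_keys:
        fields = ", ".join(row_key_fields)
        message += (
            "\n  When indexing a matrix table, you must provide a row key and a column key."
            f"\n  To index by row key only, use .rows()[{fields}]"
        )
    return message


# The case from this issue: two expressions, split into one row expression
# and one column expression, against a two-field row key.
print(row_key_mismatch_message(["locus", "alleles"], 1, 1))
```

The hint is only appended when the counts suggest the row-key-only mistake, so other key mismatches keep the existing short message.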