I have a dataset of about 18000 rows. For a particular column, just over half of the values are 0 (the data measures solar irradiance, and these are nighttime measurements). The remaining values are integers distributed between 8 and 154. The most commonly repeated nonzero values appear about 150 times each. The column's data has around three prominent modes and reasonable-looking tails. Bayeslite is guessing NOMINAL for this column instead of NUMERICAL.
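For anyone trying to reproduce this without the original data, here is a rough sketch that builds a synthetic column matching the description above and summarizes the properties a stattype heuristic could plausibly look at. Everything in it is an assumption: the helper names `make_synthetic_column` and `summarize_column`, the mode locations and spreads, and the choice of statistics are all hypothetical, and none of it is taken from bayeslite's actual guessing code or from the real dataset.

```python
import random
from collections import Counter


def make_synthetic_column(n_rows=18000, seed=0):
    """Build a made-up column shaped like the one described in this issue.

    Just over half the values are 0; the rest are integers in [8, 154]
    drawn from three overlapping bumps. The bump parameters are invented.
    """
    rng = random.Random(seed)
    n_zero = n_rows // 2 + 200  # "just over half" are zeros
    values = [0] * n_zero
    modes = [(40, 10), (85, 12), (125, 10)]  # hypothetical (mean, stddev) pairs
    while len(values) < n_rows:
        mu, sigma = rng.choice(modes)
        v = round(rng.gauss(mu, sigma))
        if 8 <= v <= 154:  # keep only values inside the reported range
            values.append(v)
    rng.shuffle(values)
    return values


def summarize_column(values):
    """Collect simple statistics that may help explain a NOMINAL guess."""
    counts = Counter(values)
    n = len(values)
    nonzero = [v for v in values if v != 0]
    return {
        "n_rows": n,
        "n_distinct": len(counts),
        "distinct_ratio": len(counts) / n,
        "zero_fraction": counts.get(0, 0) / n,
        "all_integers": all(float(v).is_integer() for v in values),
        "nonzero_min": min(nonzero) if nonzero else None,
        "nonzero_max": max(nonzero) if nonzero else None,
        # Count of the most frequently repeated nonzero value.
        "max_nonzero_repeat": max(
            (c for v, c in counts.items() if v != 0), default=0
        ),
    }


stats = summarize_column(make_synthetic_column())
for name, value in stats.items():
    print(name, value)
```

The point of the summary is that the column has at most 148 distinct integer values across 18000 rows, so the ratio of distinct values to rows is very small. If the guesser relies on something like that ratio, it might explain the NOMINAL result, but that is a guess until someone checks the heuristics against the real data.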
Hi @apuranik1, could you send me this dataset when you get a chance? I'd like to look into exactly which of the current stattype guessing heuristics caused this.