mindsdb / type_infer

Type inference for Machine Learning pipelines
GNU General Public License v3.0

When pct_invalid=0, no rows should be dropped #3

Closed: paxcema closed this issue 1 year ago

mrandri19 commented 2 years ago

Is this bug caused by these two `dropna` calls (https://github.com/mindsdb/type_infer/blob/staging/type_infer/infer.py#L383-L390)? I ask because I could not find any other place where rows are skipped in connection with pct_invalid.

paxcema commented 2 years ago

Most likely, yes. I'm not 100% familiar with this feature, though (it was implemented by another engineer). I do think a test should be part of the fix, so that we know it keeps working as intended in the future.
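
For illustration, a minimal sketch of the kind of regression check meant here. It does not call type_infer itself: `preprocess_buggy` and `preprocess_fixed` are hypothetical stand-ins for the pre-inference step, and the only assumption is that the problem comes from an unconditional pandas `dropna()`.

```python
import pandas as pd


def preprocess_buggy(df: pd.DataFrame) -> pd.DataFrame:
    # Hypothetical stand-in: mimics an unconditional dropna before inference.
    return df.dropna()


def preprocess_fixed(df: pd.DataFrame) -> pd.DataFrame:
    # Hypothetical stand-in: leaves every row in place.
    return df.copy()


def no_rows_dropped(preprocess, df: pd.DataFrame) -> bool:
    # The property under test: the row count is unchanged by preprocessing.
    return len(preprocess(df)) == len(df)


# Small frame with a missing value in each column.
sample = pd.DataFrame({"a": [1.0, None, 3.0], "b": ["x", "y", None]})

buggy_ok = no_rows_dropped(preprocess_buggy, sample)
fixed_ok = no_rows_dropped(preprocess_fixed, sample)
```

A real test would swap the stand-ins for the library's actual entry point, called with pct_invalid=0, and assert the same property.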

paxcema commented 1 year ago

Closing: it is no longer accurate to say that pct_invalid has a say in which rows are dropped prior to the type inference step.