Some notes from Adi as I dug into the code and started scaffolding this:
I am not sure that it makes sense to include this in the combine transform.
Thinking down the line, I feel like we should be trying to find features that look like the following:
If the brand's conversion rate is < 1% && they don't have a Shopify site, then they are likely fraudulent.
To get there we would need to do the following:
1. Allow for arithmetic combines
2. Do some casts across columns to make them categorical at different thresholds (could be pre-defined or system generated)
3. Run a combine function specifically centered around AND / OR of categorical and boolean columns as candidates
#2 is not done yet, but that is fine: we can implement #3 assuming we have categorical / boolean columns (we should already have some).
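A rough sketch of what the threshold cast and the AND / OR combine could look like, to make the idea concrete. This is plain Python, not the actual combine transform: the row data, the column names (`conversion_rate`, `has_shopify_site`), and the helpers `threshold_cast` and `boolean_combines` are all hypothetical and exist only for illustration.

```python
from itertools import combinations

# Hypothetical rows; column names and values are made up for illustration.
rows = [
    {"conversion_rate": 0.005, "has_shopify_site": False},
    {"conversion_rate": 0.040, "has_shopify_site": True},
    {"conversion_rate": 0.008, "has_shopify_site": True},
]


def threshold_cast(rows, column, threshold):
    """Cast a numeric column to a boolean 'below threshold' column."""
    return [row[column] < threshold for row in rows]


def boolean_combines(columns):
    """AND / OR every pair of boolean columns to produce candidate features."""
    candidates = {}
    for (name_a, col_a), (name_b, col_b) in combinations(columns.items(), 2):
        candidates[f"({name_a}) AND ({name_b})"] = [a and b for a, b in zip(col_a, col_b)]
        candidates[f"({name_a}) OR ({name_b})"] = [a or b for a, b in zip(col_a, col_b)]
    return candidates


# Boolean inputs: one produced by a threshold cast, one by negating an existing flag.
bool_columns = {
    "conversion_rate < 0.01": threshold_cast(rows, "conversion_rate", 0.01),
    "no_shopify_site": [not row["has_shopify_site"] for row in rows],
}

candidates = boolean_combines(bool_columns)
for name, values in candidates.items():
    print(name, values)
```

The AND candidate here is the "conversion rate < 1% && no Shopify site" feature from the example above. Pairwise combination grows quadratically with the number of boolean columns, so a real version would probably need some pruning of candidates.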