NIEHS / beethoven

BEETHOVEN is: Building an Extensible, rEproducible, Test-driven, Harmonized, Open-source, Versioned, ENsemble model for air quality
https://niehs.github.io/beethoven/

Rate of zeros in each feature #335

Open sigmafelix opened 4 months ago

sigmafelix commented 4 months ago

We briefly discussed how to handle features that still contain an excessive share of true zeros after postprocessing and imputation. The fraction of zeros per feature gives us a good reference for deciding which measures to take.

| Percentile         | 50   | 60   | 70   | 80   | 90   | 100  |
|--------------------|------|------|------|------|------|------|
| Number of features | 2218 | 2174 | 2100 | 2031 | 1881 | 1247 |
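
For reference, here is a minimal sketch of how a tally like this could be produced. It assumes one reading of the table, namely that each column counts the features whose zero fraction is at least the given percentage. The function names and the toy data are hypothetical and not taken from the beethoven code base, and Python is used only for illustration.

```python
def zero_fraction(values):
    """Return the share of entries in `values` that are exactly zero."""
    if not values:
        raise ValueError("feature column is empty")
    return sum(1 for v in values if v == 0) / len(values)


def count_features_at_or_above(columns, thresholds):
    """For each threshold in [0, 1], count features whose zero fraction is >= it."""
    fractions = {name: zero_fraction(vals) for name, vals in columns.items()}
    return {t: sum(1 for f in fractions.values() if f >= t) for t in thresholds}


# Toy data, made up for illustration only.
toy = {
    "feat_a": [0, 0, 0, 0],          # 100% zeros
    "feat_b": [0, 0, 0, 1.5],        # 75% zeros
    "feat_c": [0, 2.0, 3.1, 0],      # 50% zeros
    "feat_d": [1.0, 2.0, 3.0, 4.0],  # no zeros
}
counts = count_features_at_or_above(toy, [0.5, 0.75, 1.0])
print(counts)
```

Under that reading the counts can only fall as the threshold rises, which matches the shape of the numbers above.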

@kyle-messier

sigmafelix commented 3 months ago

The current imputation procedure includes every feature except those with zero variance. We could easily adjust the base function to change that criterion.
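
As a rough illustration of what such an adjustment might look like, the sketch below keeps the zero-variance exclusion and adds an optional cap on the zero fraction. It is not the actual base function: `select_features`, `max_zero_fraction`, and the toy data are hypothetical names and values, and the cap itself is only one possible criterion.

```python
from statistics import pvariance


def select_features(columns, max_zero_fraction=None):
    """Return the names of features to pass on to imputation.

    Features with zero variance are always dropped. If `max_zero_fraction`
    is given (a value in [0, 1]), features whose share of exact zeros
    exceeds it are dropped as well.
    """
    kept = []
    for name, vals in columns.items():
        if pvariance(vals) == 0:
            continue  # constant column: excluded, as in the current rule
        if max_zero_fraction is not None:
            frac = sum(1 for v in vals if v == 0) / len(vals)
            if frac > max_zero_fraction:
                continue  # too many zeros under the hypothetical extra rule
        kept.append(name)
    return kept


# Toy data, made up for illustration only.
toy = {
    "feat_a": [0, 0, 0, 0],          # constant zeros, zero variance
    "feat_b": [0, 0, 0, 1.5],        # 75% zeros
    "feat_c": [0, 2.0, 3.1, 0],      # 50% zeros
    "feat_d": [2.5, 2.5, 2.5, 2.5],  # constant non-zero, zero variance
    "feat_e": [1.0, 2.0, 3.0, 4.0],  # no zeros
}
current_rule = select_features(toy)
with_cap = select_features(toy, max_zero_fraction=0.6)
print(current_rule, with_cap)
```

With the default argument the behaviour matches the zero-variance rule described above, so the cap could be introduced without changing existing results until a threshold is agreed on.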