Closed cmgosnell closed 1 year ago
I'm not sure if this would be the right thing to do, but might it make sense to apply a standard scaler to the numeric columns that have wildly different scales? I don't feel like I have a super clear grasp of when that's appropriate, though. I could also imagine filling in a dummy value, especially for the heat content consumed, based on the primary fuel type, net generation, and an expected heat rate. That way you'd at least have some value in there that's in the right ballpark to compare with whatever is available from FERC, if anything.
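To make the two ideas concrete, here is a rough sketch, not what the codebase does today. Only `total_mmbtu` comes from this issue. The `primary_fuel` and `net_generation_mwh` column names, the helper functions, and the heat rates are all made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical heat rates in MMBtu per MWh, keyed by primary fuel type.
# Placeholder numbers for illustration only, not vetted values.
ASSUMED_HEAT_RATE = {"coal": 10.5, "gas": 8.0}


def fill_total_mmbtu(df):
    """Fill missing total_mmbtu with net generation times an assumed heat rate."""
    estimate = df["net_generation_mwh"] * df["primary_fuel"].map(ASSUMED_HEAT_RATE)
    out = df.copy()
    out["total_mmbtu"] = out["total_mmbtu"].fillna(estimate)
    return out


def standardize(df, cols):
    """Z-score each column using the population std (ddof=0); NaNs stay NaN."""
    out = df.copy()
    for col in cols:
        out[col] = (out[col] - out[col].mean()) / out[col].std(ddof=0)
    return out


# Toy records with assumed column names (only total_mmbtu is from the issue).
plants = pd.DataFrame(
    {
        "primary_fuel": ["coal", "gas", "gas"],
        "net_generation_mwh": [1000.0, 2000.0, 500.0],
        "total_mmbtu": [10000.0, np.nan, 4500.0],
    }
)

filled = fill_total_mmbtu(plants)
scaled = standardize(filled, ["net_generation_mwh", "total_mmbtu"])
```

The scaling is written out by hand to keep the sketch dependency-free. It mirrors what a standard scaler computes (subtract the mean, divide by the population standard deviation) and does not guard against constant columns.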
No longer relevant.
I'm not sure if the comparison metrics for `total_mmbtu` and `total_fuel_cost` are low because there are so many null values or because the range is so large. I tweaked the comparison feature creation a bit, but I'd like to fine-tune it, or determine whether or not this is actually a problem.
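One way to start separating the two explanations would be to look at the null share and the spread of each column side by side. This is only a sketch on toy data. The `null_and_range_summary` helper is hypothetical, and the numbers are invented, not real records.

```python
import numpy as np
import pandas as pd


def null_and_range_summary(df, cols):
    """Per column: share of nulls and ratio of largest to smallest positive value."""
    rows = {}
    for col in cols:
        # Comparisons with NaN are False, so nulls drop out of this selection.
        positive = df.loc[df[col] > 0, col]
        rows[col] = {
            "null_share": df[col].isna().mean(),
            "max_over_min": positive.max() / positive.min() if len(positive) else np.nan,
        }
    return pd.DataFrame.from_dict(rows, orient="index")


# Invented toy values, only to show the shape of the output.
records = pd.DataFrame(
    {
        "total_mmbtu": [1.0e3, np.nan, 2.0e6, np.nan],
        "total_fuel_cost": [5.0e2, 4.0e4, np.nan, 1.0e5],
    }
)

summary = null_and_range_summary(records, ["total_mmbtu", "total_fuel_cost"])
print(summary)
```

A high `null_share` with a modest `max_over_min` would point toward missing data as the cause, while the reverse would point toward scale. If both are large, the summary alone won't settle it.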