martinmm closed 2 years ago
Your humidity value should never be zero. Values below 20% or so should be Python None, as the station hardware can't measure such dry air. If your station is regularly reporting bad data like this then you should use a user calibration module to correct it.
The data is stored in your raw data files. Each row in the CSV file is a single record. The internal and external humidity are the 3rd and 5th values on the row. To replace them with None, delete the value. For example, replace
2014-01-03 02:41:37,5,50,18.7,81,10.2,986.4,1.4,2.7,8,214.5,0
with
2014-01-03 02:41:37,5,,18.7,,10.2,986.4,1.4,2.7,8,214.5,0
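If more than a handful of rows are affected, the same edit can be scripted. The following is a minimal sketch, not part of pywws: blank_humidity is a hypothetical helper that assumes the column layout described above (humidity in the 3rd and 5th values) and empties those two fields.

```python
def blank_humidity(row):
    """Return a raw CSV row with both humidity fields emptied.

    Hypothetical helper, not part of pywws. Assumes the internal and
    external humidity are the 3rd and 5th comma-separated values.
    """
    fields = row.rstrip("\n").split(",")
    for index in (2, 4):  # zero-based positions of the 3rd and 5th values
        fields[index] = ""
    return ",".join(fields)


# Apply the helper to the example record quoted above.
print(blank_humidity("2014-01-03 02:41:37,5,50,18.7,81,10.2,986.4,1.4,2.7,8,214.5,0"))
```

Run it on a copy of the raw file rather than the original, so a mistake in the column positions cannot destroy good data.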
Yes, this seems to be bogus data, not measured. It happened several times in the past and I was happy I could avoid it with that simple check.
The raw data shows bogus values on two days:
2022-08-30 15:48:00,30,54,23.8,54,22.4,457,3.1,4.1,2,63.3,0
2022-08-30 16:18:00,30,55,23.5,51,22.5,456.3,0.3,1,4,63.3,0
2022-08-30 16:48:00,30,55,23.3,57,22.2,455.4,5.8,7.1,2,63.3,0
2022-08-30 17:18:00,30,56,23.2,63,21.1,454.1,4.4,7.1,2,63.3,0
2022-08-30 17:36:50,30,78,20.6,90,18.8,1000.8,2.7,3.7,6,50.7,0
2022-08-30 17:47:15,41,0,-2125,0,768,5459,102.4,289.2,0,1364.4,48
2022-08-30 18:06:50,30,79,20.4,90,18.6,1000.4,2.7,3.4,8,50.7,0
2022-08-30 18:36:50,30,77,20.2,90,18.4,1000.5,2,3.1,10,50.7,0
2022-08-30 18:45:15,58,0,-2125,0,768,5356.7,256,264,0,1359.6,31
2022-08-30 19:06:50,30,77,20,91,18.3,1000.6,2.4,3.7,8,50.7,0
2022-08-30 19:26:15,41,0,-2125,0,768,5280,179.2,264.2,0,1355.1,27
2022-08-30 19:36:50,30,79,19.9,87,19.1,1001.2,3.4,4.4,10,50.7,0
Do you prefer manual user interaction for that? Shall I close the pull request? I would think that avoiding a zero in the log() makes the software more robust.
This sort of check is station specific. Your data shows problems with many values, not just the humidity. Filtering out bad data is best done in a user calibration module. Different stations fail in different ways, so I prefer not to weigh pywws down with checks that most users don't need.
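As a rough illustration of this kind of station-specific filtering, the sketch below blanks readings that fall outside plausible limits. The class name Calib, the calib(raw) method, the constructor arguments, and the keys hum_in, hum_out, temp_in and temp_out are assumptions here and should be checked against the pywws user calibration documentation. The limits are illustrative only.

```python
class Calib(object):
    """Sketch of a user calibration class that discards implausible readings.

    Assumption: the interface (a calib() method taking and returning a
    dict of raw values) and the key names must be verified against pywws.
    """

    # Illustrative limits only; tune them for your own station.
    HUM_MIN, HUM_MAX = 1, 100
    TEMP_MIN, TEMP_MAX = -60.0, 60.0

    def __init__(self, params=None, stored_data=None):
        # Nothing to configure in this sketch.
        pass

    def calib(self, raw):
        result = dict(raw)  # work on a copy, leave the raw record untouched
        for key in ("hum_in", "hum_out"):
            value = result.get(key)
            if value is not None and not self.HUM_MIN <= value <= self.HUM_MAX:
                result[key] = None
        for key in ("temp_in", "temp_out"):
            value = result.get(key)
            if value is not None and not self.TEMP_MIN <= value <= self.TEMP_MAX:
                result[key] = None
        return result
```

With these limits, a record like the 17:47:15 one above (humidity 0, temperatures -2125 and 768) would have all four values replaced by None, while the neighbouring good records pass through unchanged.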
Your data appears to be at 30 minute intervals (although most users prefer something shorter, typically 5 minutes) but the bad data appears at other times. There's something odd going on.
Sometimes odd old values from the data files break pywws. I could not figure out where these values are stored.
  File "/usr/local/lib/python3.9/dist-packages/pywws/conversions.py", line 199, in dew_point
    gamma = ((a * temp) / (b + temp)) + math.log(float(hum) / 100.0)
ValueError: math domain error
An additional check fixes this.
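Such a check could look roughly like the sketch below. It is not the actual pywws code: only the gamma expression is taken from the traceback, while the constants a and b (commonly used Magnus-formula values) and the final return expression are assumptions.

```python
import math


def dew_point(temp, hum):
    """Approximate dew point in degrees C, or None if it cannot be computed.

    Sketch only. The Magnus constants and the return expression are
    assumed, not copied from pywws.
    """
    if temp is None or hum is None:
        return None
    hum = float(hum)
    if hum <= 0.0:
        # math.log() raises ValueError for zero or negative arguments
        return None
    a = 17.27
    b = 237.3
    gamma = ((a * temp) / (b + temp)) + math.log(hum / 100.0)
    return (b * gamma) / (a - gamma)
```

Returning None for a zero humidity keeps the bogus record from raising ValueError, but it does not remove the bad value from the stored data.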