jim-easterbrook / pywws

Python software for USB Wireless WeatherStations
https://pywws.readthedocs.io/
GNU General Public License v2.0
204 stars 62 forks

Do additional check on humidity value #106

Closed martinmm closed 2 years ago

martinmm commented 2 years ago

Sometimes odd old values from the data files break pywws. I could not figure out where these values are stored.

  File "/usr/local/lib/python3.9/dist-packages/pywws/conversions.py", line 199, in dew_point
    gamma = ((a * temp) / (b + temp)) + math.log(float(hum) / 100.0)
ValueError: math domain error

An additional check fixes this.

jim-easterbrook commented 2 years ago

Your humidity value should never be zero. Values below 20% or so should be Python None as the station hardware can't measure such dry air. If your station is regularly reporting bad data like this then you should use a user calibration module to correct it.

The data is stored in your raw data files. Each row in the CSV file is a single record. The internal and external humidity are the 3rd and 5th values on the row. To replace them with None, delete the value. For example, replace

2014-01-03 02:41:37,5,50,18.7,81,10.2,986.4,1.4,2.7,8,214.5,0

with

2014-01-03 02:41:37,5,,18.7,,10.2,986.4,1.4,2.7,8,214.5,0
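The manual edit described above can also be scripted. The following is a minimal sketch, not part of pywws: the function name `blank_humidity` and the constant `HUMIDITY_COLUMNS` are hypothetical, and it assumes plain comma-separated rows with no quoted fields, as in the example. It is best tried on a copy of the raw data file.

```python
# Assumption: zero-based positions of the internal and external humidity,
# i.e. the 3rd and 5th values on each row as described above.
HUMIDITY_COLUMNS = (2, 4)


def blank_humidity(row_text):
    """Return the CSV row with both humidity fields emptied."""
    # The raw rows shown here contain no quoted fields, so a plain
    # split on commas is enough for this sketch.
    fields = row_text.split(",")
    for index in HUMIDITY_COLUMNS:
        fields[index] = ""
    return ",".join(fields)


if __name__ == "__main__":
    row = "2014-01-03 02:41:37,5,50,18.7,81,10.2,986.4,1.4,2.7,8,214.5,0"
    print(blank_humidity(row))
```

An empty field is what the comment above means by replacing a value with None.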

martinmm commented 2 years ago

Yes, this seems to be bogus data, not measured. It happened several times in the past and I was happy I could avoid it with that simple check.

martinmm commented 2 years ago

The raw data shows bogus values on two days:

2022-08-30 15:48:00,30,54,23.8,54,22.4,457,3.1,4.1,2,63.3,0
2022-08-30 16:18:00,30,55,23.5,51,22.5,456.3,0.3,1,4,63.3,0
2022-08-30 16:48:00,30,55,23.3,57,22.2,455.4,5.8,7.1,2,63.3,0
2022-08-30 17:18:00,30,56,23.2,63,21.1,454.1,4.4,7.1,2,63.3,0
2022-08-30 17:36:50,30,78,20.6,90,18.8,1000.8,2.7,3.7,6,50.7,0
2022-08-30 17:47:15,41,0,-2125,0,768,5459,102.4,289.2,0,1364.4,48
2022-08-30 18:06:50,30,79,20.4,90,18.6,1000.4,2.7,3.4,8,50.7,0
2022-08-30 18:36:50,30,77,20.2,90,18.4,1000.5,2,3.1,10,50.7,0
2022-08-30 18:45:15,58,0,-2125,0,768,5356.7,256,264,0,1359.6,31
2022-08-30 19:06:50,30,77,20,91,18.3,1000.6,2.4,3.7,8,50.7,0
2022-08-30 19:26:15,41,0,-2125,0,768,5280,179.2,264.2,0,1355.1,27
2022-08-30 19:36:50,30,79,19.9,87,19.1,1001.2,3.4,4.4,10,50.7,0

Do you prefer manual user intervention for that? Shall I close the pull request? I would think that avoiding a zero in the log() call makes the software more robust.
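For reference, a guard of the kind discussed here could look like the sketch below. It is a hypothetical illustration, not the change proposed in the pull request and not the pywws implementation. The `gamma` expression is taken from the traceback; the function name `safe_dew_point`, the constants `A` and `B` (commonly quoted Magnus-formula values), and the final dew point expression are assumptions.

```python
import math

# Assumption: commonly quoted Magnus-formula constants; the values used
# in pywws's conversions.py may differ.
A = 17.27
B = 237.7


def safe_dew_point(temp, hum):
    """Return the dew point in degrees C, or None if it cannot be computed."""
    if temp is None or hum is None:
        return None
    hum = float(hum)
    if hum <= 0.0:
        # math.log() raises "ValueError: math domain error" for zero or
        # negative input, so treat such humidity values as missing.
        return None
    gamma = ((A * temp) / (B + temp)) + math.log(hum / 100.0)
    return (B * gamma) / (A - gamma)


if __name__ == "__main__":
    print(safe_dew_point(20.6, 78))   # a plausible record
    print(safe_dew_point(-2125, 0))   # a bogus record: prints None
```

Returning None only hides the zero humidity; the other bogus values in the same record (temperature, pressure, wind) would still pass through unchanged.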

jim-easterbrook commented 2 years ago

This sort of check is station specific. Your data shows problems with many values, not just the humidity. Filtering out bad data is best done in a user calibration module. Different stations fail in different ways, so I prefer not to weigh pywws down with checks that most users don't need.
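As a rough illustration of station-specific filtering, the sketch below replaces implausible values in a single record with None. It is standalone and does not use the pywws calibration API; how to hook such logic into a user calibration module is covered by the pywws documentation. The function name `drop_implausible`, the key names, and every limit in `LIMITS` are assumptions to be adjusted for the station in question.

```python
# Assumption: key names modelled on the fields of a raw record; the
# limits are illustrative only and should be tuned to the station.
LIMITS = {
    "hum_in": (1, 100),
    "hum_out": (1, 100),
    "temp_in": (-40.0, 60.0),
    "temp_out": (-40.0, 60.0),
    "abs_pressure": (300.0, 1100.0),
}


def drop_implausible(raw):
    """Return a copy of raw with out-of-range values replaced by None."""
    result = dict(raw)
    for key, (low, high) in LIMITS.items():
        value = result.get(key)
        if value is not None and not (low <= value <= high):
            result[key] = None
    return result


if __name__ == "__main__":
    # Values taken from the 17:47:15 record quoted earlier in the thread.
    bad = {"hum_in": 0, "temp_in": -2125, "hum_out": 0,
           "temp_out": 768, "abs_pressure": 5459}
    print(drop_implausible(bad))
```

Applied to the 17:47:15 record, all five listed values fall outside the limits and become None, while a record such as the one at 17:36:50 passes through unchanged.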

Your data appears to be at 30-minute intervals (although most users prefer something shorter, typically 5 minutes) but the bad data appears at other times. There's something odd going on.