kilianknoll / DWDForecast

Python module to query the DWD Mosmix data for irradiation - and other relevant data to forecast solar generation
GNU General Public License v3.0

Error processing DWDArray Length mismatch: Expected axis has 247 elements, new values have 248 elements #4

Closed plin2 closed 3 years ago

plin2 commented 3 years ago

For the past couple of days the download/import of DWD data hasn't been working. I get the following messages:

    Starting dwdforecast init ...
    I am looking for data from DWD for the following station: 10517
    I will be polling the following URL for the latest updates https://opendata.dwd.de/weather/local_forecasts/mos/MOSMIX_L/single_stations/10517/kml
    From dwdforecast - initial queue population 1603369793.521941
    Waiting on DWD dwdforecastdata Queue results to tell it is started...
    From Main : DWD File access I checked / got uploaded by DWD was at : 1603369793.521941 2020-10-22T14:29:53.521Z
    Here is what we got from DWD :
    Starting dwdforecast init ...
    I have set my DB connection
    I am looking for data from DWD for the following station: 10517
    I will be polling the following URL for the latest updates http://opendata.dwd.de/weather/local_forecasts/mos/MOSMIX_L/single_stations/10517/kml
    From dwdforecast - initial queue population 1603369832.0948741
    Waiting on DWD dwdforecastdata Queue results to tell it is started...
    From Main : DWD File access I checked / got uploaded by DWD was at : 1603369832.0948741 2020-10-22T14:30:32.094Z
    Interaction is Simple - processing once only
    Error processing DWDArray Length mismatch: Expected axis has 247 elements, new values have 248 elements
    Closing thread & exiting
    Thread is going down ...

kilianknoll commented 3 years ago

I tried to run with the current version of the script - and the station you specified but cannot reproduce the issue. Here is what I see as output:

    python dwdforecast.py
    Starting dwdforecast init ...
    I am looking for data from DWD for the following station: 10517
    I will be polling the following URL for the latest updates http://opendata.dwd.de/weather/local_forecasts/mos/MOSMIX_L/single_stations/10517/kml
    From dwdforecast - initial queue population 1603426865.7387545
    Waiting on DWD dwdforecastdata Queue results to tell it is started...
    From Main : DWD File access I checked / got uploaded by DWD was at : 1603426865.7387545 2020-10-23T06:21:05.738Z
    Interaction is Simple - processing once only
    After Weather check....
    Closing thread & exiting
    Thread is going down ...

What I noted from your initial post is that one of the DWD URLs uses https and the other uses plain http. I propose to work via http requests only. I have attached the adjusted configuration file. Could you please use the current version of the script, run it with the attached configuration file and, if you still see the issue, also provide the debug output?

configuration.txt

plin2 commented 3 years ago

Hi,

the https call comes from another program running within the same script:

    #!/bin/bash
    cd /opt/fhem/python

    # Fetch data for FHEM and enter it
    #/usr/bin/python3 /opt/fhem/python/bin/dwd_get_forecast.py 192.168.3.10 10517
    find . -mtime 1 -name "MOSMIX*.kml" -exec rm {} ';'

    # Prepare data for MySQL and load it
    /usr/bin/python3 /opt/fhem/python/bin/dwdforecast.py

The output of dwdforecast.py is

    Starting dwdforecast init ...
    I have set my DB connection
    I am looking for data from DWD for the following station: 10517
    I will be polling the following URL for the latest updates http://opendata.dwd.de/weather/local_forecasts/mos/MOSMIX_L/single_stations/10517/kml
    From dwdforecast - initial queue population 1603430423.9112914
    Waiting on DWD dwdforecastdata Queue results to tell it is started...
    From Main : DWD File access I checked / got uploaded by DWD was at : 1603430423.9112914 2020-10-23T07:20:23.911Z
    Interaction is Simple - processing once only
    Error processing DWDArray Length mismatch: Expected axis has 247 elements, new values have 248 elements
    Closing thread & exiting
    Thread is going down ...

Attached are configuration.ini and dwd_debug.txt

configuration.ini.txt dwd_debug.txt

kilianknoll commented 3 years ago

Hmm ... Length mismatch is an indicator of the Pandas array being "somehow" out of sync. Unfortunately I can't reproduce it. However, I have just uploaded a new version that includes debug statements which may tell us where those inconsistencies come from. Could you please rerun with the updated version and provide the resulting output as well as the debug logfile? Also, does this issue occur every time you run the script, or is it a sporadic error? PS: no changes were made to the configuration.ini

plin2 commented 3 years ago

I did it the simple way and added some print statements. The relevant changes are:

                    print ("step3")

                    #Gathering time series from start - and end hours (240 rows):
                    self.local_timestamp= pd.date_range(start=self.first, end=self.last, freq='1h',tz=self.mytimezone)

                    self.PandasDF['Rad1wh'] = 0.277778*self.PandasDF.Rad1h

                    self.PandasDF['Rad1Energy'] = self.mysimplemultiplicationfactor*self.PandasDF.Rad1wh

                    print ("step3a: ", self.local_timestamp)
                    print ("step3b: ", self.PandasDF.index)
                    self.PandasDF.index = self.local_timestamp
                    print ("step3c: ", self.PandasDF.index)
                    print ("step4")

Running the script shows following output:

    Starting dwdforecast init ...
    I am looking for data from DWD for the following station: 10517
    I will be polling the following URL for the latest updates http://opendata.dwd.de/weather/local_forecasts/mos/MOSMIX_L/single_stations/10517/kml
    From dwdforecast - initial queue population 1603547875.3012154
    Waiting on DWD dwdforecastdata Queue results to tell it is started...
    From Main : DWD File access I checked / got uploaded by DWD was at : 1603547875.3012154 2020-10-24T15:57:55.301Z
    Interaction is Simple - processing once only
    step1
    step2
    step3
    step3a: DatetimeIndex(['2020-10-24 10:00:00+02:00', '2020-10-24 11:00:00+02:00',
                   '2020-10-24 12:00:00+02:00', '2020-10-24 13:00:00+02:00',
                   '2020-10-24 14:00:00+02:00', '2020-10-24 15:00:00+02:00',
                   '2020-10-24 16:00:00+02:00', '2020-10-24 17:00:00+02:00',
                   '2020-10-24 18:00:00+02:00', '2020-10-24 19:00:00+02:00',
                   ...
                   '2020-11-03 07:00:00+01:00', '2020-11-03 08:00:00+01:00',
                   '2020-11-03 09:00:00+01:00', '2020-11-03 10:00:00+01:00',
                   '2020-11-03 11:00:00+01:00', '2020-11-03 12:00:00+01:00',
                   '2020-11-03 13:00:00+01:00', '2020-11-03 14:00:00+01:00',
                   '2020-11-03 15:00:00+01:00', '2020-11-03 16:00:00+01:00'],
                  dtype='datetime64[ns, Europe/Berlin]', length=248, freq='H')
    step3b: RangeIndex(start=0, stop=247, step=1)
    Error processing DWDArray Length mismatch: Expected axis has 247 elements, new values have 248 elements
    Closing thread & exiting
    Thread is going down ...

So the statement self.PandasDF.index = self.local_timestamp is causing the error.

The DatetimeIndex has a length of 248, but the count from the KML file is 247:

    sylvester:/opt/fhem/python # grep "" MOSMIX_L_2020102409_10517.kml | wc -l
    247

What next?
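
The error above can be reproduced in isolation. The following is a minimal sketch, not the module's actual code: the DataFrame contents and the names `df`, `idx` and `message` are assumptions for illustration; only the two lengths (247 rows, 248 timestamps) are taken from the output above.

```python
import pandas as pd

# Hypothetical stand-in for self.PandasDF: 247 rows with a default RangeIndex.
df = pd.DataFrame({"Rad1h": [0.0] * 247})

# Hypothetical stand-in for self.local_timestamp: 248 hourly timestamps.
idx = pd.date_range("2020-10-24 10:00", periods=248, freq="1h", tz="Europe/Berlin")

message = ""
try:
    # Assigning an index whose length differs from the row count raises ValueError.
    df.index = idx
except ValueError as err:
    message = str(err)

print(message)
```

Comparing `len(df)` with `len(idx)` before the assignment would turn the exception into an explicit check that can be logged.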

kilianknoll commented 3 years ago

Peter, I tried to identify the problem but had no success: I used your configuration files, tried to somehow break it, etc. I did make one related change though. After some more investigation of the DWD website, I realized that all of the DWD timestamps are in the UTC (formerly GMT) timezone. I have therefore changed the script to also perform the calculations with pvlib in the same timezone.
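
A likely explanation for the extra element, offered as an assumption rather than something confirmed in this thread: the range shown in the step3a output spans 25 October 2020, when Europe/Berlin switched from +02:00 to +01:00. If the start and end values are interpreted as local wall-clock times, that night contributes one additional hour. The sketch below uses the start and end times from the step3a output; the variable names are hypothetical and this is not the module's code.

```python
import pandas as pd

# Wall-clock endpoints as they appear in the step3a output (assumed to be naive).
start, end = "2020-10-24 10:00", "2020-11-03 16:00"

# Interpreted as Europe/Berlin, the range crosses the end of daylight saving time.
local_idx = pd.date_range(start=start, end=end, freq="1h", tz="Europe/Berlin")

# Interpreted as UTC, there is no clock change inside the range.
utc_idx = pd.date_range(start=start, end=end, freq="1h", tz="UTC")

print(len(local_idx), len(utc_idx))  # 248 247
```

Under this reading, building the index in UTC yields 247 timestamps, which matches the 247 rows reported in the error message.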

plin2 commented 3 years ago

ok, that's it.

Thanks Kilian