artrune closed this issue 2 years ago
Hi @artrune
Interesting issue. I need to know the version of numpy / pyaf you are using. An anonymized version of your data and/or a script that fails are also welcome.
The basic requirements are here :
https://github.com/antoinecarme/pyaf/blob/master/ISSUE_TEMPLATE.md
Here's a subset of the data I was using:
{
"results": [
{
"statement_id": 0,
"series": [
{
"name": "temperature",
"columns": [
"time",
"value"
],
"values": [
[
"2021-08-01T15:19:40.791108016Z",
43.5901031494
],
[
"2021-08-01T15:18:26.407537578Z",
43.5352668762
],
[
"2021-08-01T15:17:12.252326253Z",
43.6632194519
],
[
"2021-08-01T15:15:57.981777501Z",
43.4987106323
],
[
"2021-08-01T15:14:43.74242866Z",
43.5535430908
],
[
"2021-08-01T15:13:29.624277764Z",
43.5718231201
],
[
"2021-08-01T15:12:15.401747322Z",
43.7363357544
],
[
"2021-08-01T15:11:01.165682994Z",
43.480430603
],
[
"2021-08-01T15:09:46.742506101Z",
43.4621505737
],
[
"2021-08-01T15:08:32.613850591Z",
43.5535430908
]
]
}
]
}
]
}
Here are the versions of the packages:
numpy: 1.19.2
pandas: 1.2.4
pyaf: 2.0.1
I was loading the data into a dataframe using:
df_result = pd.DataFrame(json['results'][0]['series'][0]['values'], columns=['time', 'value'])
df_result['date'] = pd.to_datetime(df_result['time'], utc=True)
Training was failing, so I added the cast:
df_result['date'] = df_result['date'].values.astype('<M8[s]')
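For anyone who wants to try the same steps outside of InfluxDB, here is a minimal self-contained sketch of the load and cast, using two rows from the data posted above. The names `payload` and `truncated` are only illustrative and are not from the original script. The cast simply drops the sub-second part of each timestamp; whether the dataframe column then stays at second resolution or is stored again at nanosecond resolution (with zeroed sub-seconds) may depend on the pandas version.

```python
import pandas as pd

# Two rows copied from the data posted in this thread, in the same
# nested layout as the InfluxDB response shown above.
payload = {
    "results": [{
        "statement_id": 0,
        "series": [{
            "name": "temperature",
            "columns": ["time", "value"],
            "values": [
                ["2021-08-01T15:19:40.791108016Z", 43.5901031494],
                ["2021-08-01T15:18:26.407537578Z", 43.5352668762],
            ],
        }],
    }]
}

values = payload["results"][0]["series"][0]["values"]
df_result = pd.DataFrame(values, columns=["time", "value"])

# Parse the nanosecond-precision strings as UTC timestamps.
df_result["date"] = pd.to_datetime(df_result["time"], utc=True)

# Workaround from the thread: take the underlying numpy array (UTC,
# timezone-naive) and cast it to second precision, dropping sub-seconds.
truncated = df_result["date"].values.astype("<M8[s]")
df_result["date"] = truncated

print(truncated.dtype)
print(df_result[["date", "value"]])
```

Note that after this cast the `date` column is timezone-naive, with the values expressed in UTC.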
@artrune
Thanks for the data. I will see what I can do with that.
Thanks. I have no real interest in dealing with nanoseconds; the underlying database I was polling (InfluxDB 1.8) just stores dates with that precision.
I remember trying to remove the utc flag, but I don't think it helped.
OK for the nanoseconds. That will help with prioritizing this issue. It seems to be a numpy issue, as you noted in the first comment.
As long as we have a workaround, this bug will be fixed in PyAF 4.0 (release date : July 2022 ;).
Thanks for this feedback. A long time ago, I was interested in IoT and time series usage with PyAF:
https://github.com/antoinecarme/pyaf/issues/3
Do you have some pointers, demos, or notebook links to your usage of PyAF with InfluxDB?
To be honest, I wasn't using any demo data. I just happen to have been collecting my Raspberry Pi's temperature for some months now. If you want that data, I can share it without any problem.
@artrune
LOL. So, just by playing with your Raspberry Pi and a Time Series database, you are helping fix a problem for PyAF and probably numpy users. Don't underestimate the power of usage !!!
Thank you very much for helping make PyAF better and don't hesitate to come back and report similar issues.
Your Raspberry Pi rocks !!!
Antoine
Absolutely! I'll let my RaspberryPi know he's a good boy!
Thanks and stay safe.
I was testing on some data, and I kept getting exceptions saying that train failed. After looking around, I realized it was because of the high-precision dates.
I changed the precision by casting my dates to seconds, and then train worked fine:
df['date'] = df['date'].values.astype('<M8[s]')
It seems that the underlying problem is in some numpy function, but I'm not too sure.