Closed DamianUS closed 2 years ago
Dear @npatki, thank you in advance for your support! I'm having a similar issue, described below:
Environment Details
SDV version: 0.16.0
Python version: 3.8.13
Operating System: Windows 10
Error: Exception has occurred: KeyError 'Time'
The above exception was the direct cause of the following exception:
File "C:\Users\Data_Augmentation\PAR_Model.py", line 13, in
import pandas as pd
from sdv.timeseries import PAR
data = pd.read_pickle('df_PAR.pkl')
context_columns = ['POM', 'Mold Temperature [°C]', 'Injection velocity [cmm/s]', 'Holding pressure [bar]']
entity_columns = ['id']
sequence_index = 'Time'
model = PAR(entity_columns=entity_columns, context_columns=context_columns, sequence_index=sequence_index)
model.fit(data)
new_data = model.sample(1)
model.save('Timeseries_synthetic_model.pkl')
Attached you will find the .py file and the .pkl file with the data.
PS: I tried to reproduce the example shown here: https://sdv.dev/SDV/user_guides/timeseries/par.html but I can't access the file. I wanted to check the data types of the variables.
https://github.com/sdv-dev/SDV/issues/808#issuecomment-1133123852
I understand what's going on. My Time column is of float type, but PAR only accepts a datetime type for it...
@yamidibarra, I'm having the same issue: the Time column needs to be in datetime format.
Hi everyone,
Yes @yamidibarra, I agree with you. Issue #808 is likely the root cause for all these errors: It is a known issue that the PAR model currently produces a sampling error when sequence_index is numerical (float, int). The error should go away if you express sequence_index as a datetime or if you remove it altogether.
Does this accurately describe everyone's scenario? If so, I can close this issue in favor of #808 for tracking.
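To check whether a dataset falls into this scenario, the dtype of the sequence index column can be inspected with pandas before fitting. The following is only a minimal sketch: `sequence_index_kind` is a hypothetical helper (not part of SDV), and the small DataFrame is made-up illustration data with a float `Time` column like the one reported above.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype, is_numeric_dtype


def sequence_index_kind(df: pd.DataFrame, column: str) -> str:
    """Classify a column as 'datetime', 'numerical', or 'other'.

    Hypothetical helper for illustration; it only looks at the pandas dtype.
    """
    if is_datetime64_any_dtype(df[column]):
        return "datetime"
    if is_numeric_dtype(df[column]):
        return "numerical"
    return "other"


# Made-up data: two sequences with a float time column measured in seconds.
example = pd.DataFrame({
    "id": [1, 1, 2, 2],
    "Time": [0.0, 0.5, 0.0, 0.5],
})
print(sequence_index_kind(example, "Time"))  # the numerical case from #808

# Expressing the column as a datetime (seconds since the Unix epoch).
example["Time"] = pd.to_datetime(example["Time"], unit="s")
print(sequence_index_kind(example, "Time"))
```

If the helper reports "numerical", the column matches the case described in #808, and converting it to a datetime (or dropping sequence_index) is the suggested workaround.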
BTW --
@DamianUS, thanks for filing this issue! I will delete the comments in #935 since you copied them over here.
@yamidibarra, re the link:
PS: I tried to reproduce the example shown here: https://sdv.dev/SDV/user_guides/timeseries/par.html but I can't access the file. I wanted to check the data types of the variables.
The text of the link is correct but the hyperlink is pointing to some other URL. You should be able to open the page if you click on this: https://sdv.dev/SDV/user_guides/timeseries/par.html.
Yes, it resolves this specific issue. Here is my workaround. I'll open another issue regarding the synthetic data; I have some questions and would appreciate your opinion, @npatki.
data = pd.read_pickle('df_PAR.pkl')
data['Time'] = data['Time'].multiply(1E9)
data['Time'] = pd.to_datetime(data['Time'])
context_columns = ['POM', 'Mold Temperature [°C]', 'Injection velocity [cmm/s]', 'Holding pressure [bar]']
entity_columns = ['id']
sequence_index = 'Time'
model = PAR(entity_columns=entity_columns, context_columns=context_columns, sequence_index=sequence_index)
model.fit(data)
new_data = model.sample(1)
# get seconds
new_data['Time']=new_data['Time'].apply(lambda x:'%02d.%06d' %(x.second, x.microsecond)).astype(float)
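Note that the last line of the workaround rebuilds the float from `x.second` and `x.microsecond` only, so it is limited to sequences shorter than 60 seconds. A more general round trip can be sketched with pandas alone. This is an illustration under assumptions, not the approach used in the thread: it assumes the float column holds seconds, the helper names `seconds_to_datetime` and `datetime_to_seconds` are hypothetical, and the sample values are made up.

```python
import pandas as pd

EPOCH = pd.Timestamp(0)  # 1970-01-01 00:00:00, the reference point for the round trip


def seconds_to_datetime(seconds: pd.Series) -> pd.Series:
    """Interpret float seconds as offsets from the Unix epoch (hypothetical helper)."""
    return pd.to_datetime(seconds, unit="s")


def datetime_to_seconds(timestamps: pd.Series) -> pd.Series:
    """Convert timestamps back to float seconds since the Unix epoch (hypothetical helper)."""
    return (timestamps - EPOCH).dt.total_seconds()


# Made-up values, including one beyond 60 seconds.
original = pd.Series([0.0, 0.5, 75.25], name="Time")
as_datetime = seconds_to_datetime(original)   # use this column as sequence_index
restored = datetime_to_seconds(as_datetime)   # apply this to the sampled 'Time' column
print(restored.tolist())
```

Because the conversion back uses the full elapsed time rather than the seconds field alone, values past one minute are preserved.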
Great, thanks for confirming! I'll close this issue in favor of #808.
Please feel free to reply if you continue to see a KeyError on the PAR model even if you have a datetime sequence_index and I can reopen this issue for discussion.
Hi, I am facing the same KeyError in PARSynthesizer as described here, even though sequence_index is datetime. Please see issue #1510.
P.S. The KeyError that I get comes from the context_columns.
@mohammedsabiya Thanks for filing! We'll follow up in the new issue, as it's been some time since this original one was resolved.
Environment Details
Please indicate the following details about the environment in which you found the bug:
SDV version: 0.16.0
Python version: 3.8.13 (default, May 8 2022, 17:48:02) [Clang 13.1.6 (clang-1316.0.21.2)]
Operating System: Macbook Pro M1 Mac OS X 12.0.1
Error description
The KeyError is also raised when trying to sample from a freshly trained PAR model in v0.16.0.
I tried both with and without passing the field types metadata; neither helps.
I printed the model metadata to check whether the model inferred the data types properly, and everything looks correct.
Here I attach the code used just in case it helps (this is the last version used in which the model infers the field types):
When trying to sample:
Maybe I'm not doing something properly. I'm new to the library!