SugiharaLab / pyEDM

Python package of EDM tools

For SMap, if dataframe doesn't have target values for pred indices, the class throws error. #55

Closed dineshws closed 1 month ago

dineshws commented 1 month ago

I am trying to use SMap for time series forecasting.

Given an input dataframe with lib = "startLib endLib" and pred = "startPred endPred", I get good results only if both index ranges contain valid values WITHIN the input dataframe. If I set the pred range beyond the end of the input dataframe, it throws an error. So I appended a dummy dataframe of zeros to the end of the input dataframe, sized to cover the pred range. With that, the predictions are horrible.

If I follow the tutorial exactly on my data, I get proper results. In the tutorial, the dataframe has future values for the target variable. But if I set the future values to NaN, it throws an error. If I set them to 0.0, it produces junk results.

How should I prepare my dataframe properly so that forecasting can be achieved?

SoftwareLiteracy commented 1 month ago

Please provide a minimal example of the problem and version information.

The state space library is created from lib, and predictions are made by finding the nearest neighbors of each pred point in the state space. Thus, as you identified, both lib and pred must contain data.
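
To make the point about lib and pred concrete, here is a minimal sketch of a nearest-neighbor forecast in a time-delay embedding. It is not pyEDM's implementation and it is not S-Map (it averages neighbor targets instead of fitting a locally weighted linear model). The function names `embed` and `nn_forecast`, the 0-based indexing, and the sine-wave data are all assumptions made for illustration. It only shows why every pred point needs observed values to form its state vector, while the forecast target Tp steps ahead does not need to exist in the data.

```python
import math

def embed(series, E):
    """Time-delay embedding: the state at time t is (x[t], x[t-1], ..., x[t-E+1])."""
    return {t: tuple(series[t - k] for k in range(E))
            for t in range(E - 1, len(series))}

def nn_forecast(series, lib, pred, E=2, Tp=1, k=None):
    """Forecast series[t + Tp] for each t in pred from the k nearest library states."""
    k = E + 1 if k is None else k
    vectors = embed(series, E)
    # A library point is usable only if its state vector AND its value
    # Tp steps ahead are both observed.
    lib_times = [s for s in lib if s in vectors and s + Tp < len(series)]
    forecasts = {}
    for t in pred:
        if t not in vectors:
            continue  # no observed data to build the query state: no forecast
        query = vectors[t]
        neighbors = sorted(lib_times,
                           key=lambda s: math.dist(query, vectors[s]))[:k]
        # Unweighted mean of the neighbors' future values (a simplification).
        forecasts[t + Tp] = sum(series[s + Tp] for s in neighbors) / len(neighbors)
    return forecasts

# Hypothetical data: 200 samples of a sine wave, indices 0..199.
series = [math.sin(0.1 * i) for i in range(200)]
forecasts = nn_forecast(series, lib=range(0, 150), pred=range(195, 200), E=2, Tp=1)

# Index 200 is one step past the last observation, yet it gets a forecast,
# because the query state at index 199 is fully observed.
print(sorted(forecasts))
```

Padding the series with zeros would instead create pred states built from the zeros, which lie far from anything in the library. That is consistent with the poor results described in the question.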

Forecasting beyond pred can be done with Tp > 0. For example:

>>> import pyEDM
>>> pyEDM.__version__
'2.0.3'
>>> df = pyEDM.sampleData['Lorenz5D']
>>> df.shape
(1000, 6)
>>> df.iloc[-3:,:]
      Time      V1      V2      V3      V4      V5
997  59.85  0.5780 -1.6804  3.7693  8.3641  4.3277
998  59.90 -0.8845 -1.2133  3.3424  9.2297  2.8772
999  59.95 -1.4639 -0.8408  3.0409  9.6364  1.0788
>>> sm = pyEDM.SMap( dataFrame = df, columns = 'V1', target = 'V1',
         lib = [1,500], pred = [997,1000], E = 5, Tp = 1, theta = 3. )
>>> sm['predictions']
    Time  Observations  Predictions  Pred_Variance
0  59.80        2.5553          NaN            NaN
1  59.85        0.5780     0.539239      16.484263
2  59.90       -0.8845    -0.900529      15.487685
3  59.95       -1.4639    -1.457071      11.789297
4  60.00           NaN    -1.109293       8.244481
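
In the output above, the last row (Time 60.00) has a prediction but no observation, so it is the forecast beyond the end of the data. The sketch below shows one way to pull such rows out with pandas. It rebuilds the table from the values printed above so that it runs without pyEDM; with a real result, `predictions` would simply be `sm['predictions']`. The variable names are illustrative only.

```python
import pandas as pd

nan = float("nan")

# Values copied from the sm['predictions'] output shown above.
predictions = pd.DataFrame({
    "Time":         [59.80, 59.85, 59.90, 59.95, 60.00],
    "Observations": [2.5553, 0.5780, -0.8845, -1.4639, nan],
    "Predictions":  [nan, 0.539239, -0.900529, -1.457071, -1.109293],
})

# Out-of-sample forecasts: a prediction exists but no observation does.
future = predictions[predictions["Observations"].isna()
                     & predictions["Predictions"].notna()]
print(future)
```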