oceanprotocol / pdr-backend

Instructions & code to run predictoors, traders, more.
Apache License 2.0

[Bug, sim plots] Sometimes plots briefly disappear and say 'Error/waiting: cannot unpack non-iterable NoneType object' #1272

Closed trentmc closed 3 months ago

trentmc commented 3 months ago

Summary of issue, and solution

Hypothesis: sometimes a new st_*.yyy.pkl file appears and a new aimodel_plotdata_*.yyy.pkl hasn't appeared yet, and it prevents pdr sim_plots from rendering.

Likely solution:

Details

Sometimes when I'm running the simulation engine (pdr sim) and the accompanying plots (pdr sim_plots), all plots in the browser window disappear and are replaced with the message "Error/waiting: cannot unpack non-iterable NoneType object". (Full screenshot below.) Right before that, the browser tab says "Updating..."

It tends to happen more often when the model-building takes longer, eg with max_n_train: 10000, autoregressive_n: 20.

It also sometimes happens if I stop the simulation engine, by ctrl-C'ing the process in the pdr sim window but letting the pdr sim_plots process run. And really what that should do is show the most recent results. I say "sometimes happens" because maybe 80% of the time it does work.

It also sometimes happens if I ctrl-C the pdr sim_plots process and re-start it. (80% of the time it does work.)

FYI here is the info in the sim_state directory:

Datapoint: If I delete the most recent file (st_20240621_104412.727.pkl), then pdr sim_plots successfully renders. Presumably using the second-most-recent file.

Towards a solution

Maybe the .pkl data gets corrupted now and then, depending when I ctrl-c?

Full screenshot

[Image: Screenshot 2024-06-21 at 10 44 32]

calina-c commented 3 months ago

@trentmc yes, both aimodel_plotdata_*.pkl and st_*.pkl are needed, and writing either one could be interrupted by hitting ctrl+c. A workaround would be to detect files that are older than, e.g., 2-3 seconds but still do not unpack: that means the process was interrupted and the state file was left corrupted. In that case, we can discard the file and use the next-to-last. I always keep the last and next-to-last pair of files (aimodel and st) specifically for this case.
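
The fallback described above could be sketched roughly as follows. This is only an illustration, not the pdr-backend implementation: the function name `load_newest_readable` and its parameters are hypothetical, and the sketch skips the age check and simply falls back whenever the newest file fails to unpickle.

```python
import pickle
from pathlib import Path
from typing import Any, Optional

# Errors that pickle.load typically raises on truncated or corrupt input.
_LOAD_ERRORS = (EOFError, pickle.UnpicklingError)


def load_newest_readable(state_dir: Path, pattern: str = "st_*.pkl") -> Optional[Any]:
    """Return the contents of the newest file that unpickles, else None.

    Hypothetical helper: assumes timestamped names such as
    st_20240621_104412.727.pkl, which sort chronologically as strings.
    """
    # Reverse lexicographic order means newest first.
    for path in sorted(state_dir.glob(pattern), reverse=True):
        try:
            with path.open("rb") as f:
                return pickle.load(f)
        except _LOAD_ERRORS:
            # Probably a half-written file; try the next-to-last one.
            continue
    return None
```

With the directory from this issue, the call would be `load_newest_readable(Path("sim_state"))`. Note that this handles a corrupt file, but not the case where one file of the pair is missing altogether.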

trentmc commented 3 months ago

Time based detection feels brittle.

Perhaps the plotter needs both the aimodel_plotdata_*.pkl file and the st_*.pkl file.

Why not: once the newest versions of both files are detected, then it's ok to delete older versions?

calina-c commented 3 months ago

Time based detection feels brittle.

Perhaps the plotter needs both the aimodel_plotdata_*.pkl file and the st_*.pkl file.

Why not: once the newest versions of both files are detected, then it's ok to delete older versions?

Because the pickling process takes some time. The file already exists before the pickling finishes. That's why we rely on the next-to-last which has most certainly finished pickling.

Judging by the error message, here is what I think is happening: "st_x" exists, while "aimodel_plotdata_x" does not. This is because the process was interrupted by hitting ctrl+c. The plotter tries to get "aimodel_plotdata_x" because it treats the existence of "st_x" as a sign that the corresponding plotdata exists as well.
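
For context, this exact message is what Python raises when code tuple-unpacks a `None` return value. The snippet below is a minimal reproduction under the assumption that the plotter's loader returns `None` when the plotdata file is missing; `load_pair` is a hypothetical stand-in, not the real pdr-backend function.

```python
from typing import Optional, Tuple


def load_pair(plotdata_exists: bool) -> Optional[Tuple[str, str]]:
    """Hypothetical stand-in for the plotter's loader."""
    if not plotdata_exists:
        # Assumed behavior: no result when aimodel_plotdata_x is missing.
        return None
    return ("st data", "aimodel plot data")


try:
    # Unpacking None into two names raises TypeError.
    st, plotdata = load_pair(plotdata_exists=False)
    message = "ok"
except TypeError as err:
    message = f"Error/waiting: {err}"

print(message)
```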

As far as I can tell, there are some ways to fix this issue.

  1. Advise users that interrupting the process can result in dirty states and they should delete them.
  2. When plotting, check that the entire pair, not just st_x, exists at a given moment in time.

trentmc commented 3 months ago

[option 2] When plotting, check that the entire pair, not just st_x exists at a given moment in time

Let's do this option.
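
A minimal sketch of what option 2 might look like, assuming both files of a pair share the same timestamp suffix (as in st_20240621_104412.727.pkl). The function name `latest_complete_timestamp` is hypothetical, and this is not the code that was eventually merged.

```python
from pathlib import Path
from typing import Optional, Set

ST_PREFIX = "st_"
PLOTDATA_PREFIX = "aimodel_plotdata_"
SUFFIX = ".pkl"


def _timestamps(state_dir: Path, prefix: str) -> Set[str]:
    """Collect the timestamp part of every <prefix><timestamp>.pkl file."""
    return {
        p.name[len(prefix):-len(SUFFIX)]
        for p in state_dir.glob(f"{prefix}*{SUFFIX}")
    }


def latest_complete_timestamp(state_dir: Path) -> Optional[str]:
    """Return the newest timestamp that has BOTH files, else None."""
    complete = _timestamps(state_dir, ST_PREFIX) & _timestamps(
        state_dir, PLOTDATA_PREFIX
    )
    # YYYYMMDD_HHMMSS.mmm strings sort chronologically, so max() is newest.
    return max(complete) if complete else None
```

The plotter would then load only the pair for the returned timestamp, so a lone st_x file without its aimodel_plotdata_x is ignored until its partner appears. A file that exists but is still being pickled would still need separate handling.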

Option 1 isn't quite relevant, because (a) the issue also exists when not ctrl-c'ing, and (b) the main "users" are us, and we know that instabilities can occur under a ctrl-c. However, under a ctrl-c the plotter should still be stable if possible... and we do have a way :)