The sim run fails whether or not parquet_dir already had data.
Relevant ppss.yaml settings
Some ppss.yaml values (the defaults):
```text
lake_ss:
  parquet_dir: parquet_data
  feeds:
    - binance BTC/USDT 1h
  st_timestr: 2023-06-01_00:00 # starting date for data
  fin_timestr: now # ending date for data
...
predictoor_ss:
  predict_feed: binance BTC/USDT c 1h
  bot_only:
    s_until_epoch_end: 60 # in s. Start predicting if there's > this time left
    stake_amount: 1 # stake this amount with each prediction. In OCEAN
  approach3:
  aimodel_ss:
    input_feeds:
      - binance BTC/USDT
    max_n_train: 5000 # no. epochs to train model on
    autoregressive_n : 10 # no. epochs that model looks back, to predict next
    approach: LIN
```
Full traceback
```text
(venv) trentmc@tlm-macbook: ~/code/pdr-backend $ pdr sim ppss.yaml
dftool sim: Begin
Arguments:
PPSS_FILE=ppss.yaml
Start run
Get historical data, across many exchanges & pairs: begin.
Data start: timestamp=1685577600000, dt=2023-06-01_00:00:00.000
Data fin: timestamp=1704967729680, dt=2024-01-11_10:08:49.680
Update all rawohlcv files: begin
Update rawohlcv file at exchange=binance, pair=BTC/USDT: begin
filename=/Users/trentmc/code/pdr-backend/parquet_data/binance_BTC-USDT_1h.parquet
No file exists yet, so will fetch all data
Aim to fetch data from start time: timestamp=1685577600000, dt=2023-06-01_00:00:00.000
Fetch up to 1000 pts from timestamp=1685577600000, dt=2023-06-01_00:00:00.000
newest_ut_value: 1689174000000
Fetch up to 1000 pts from timestamp=1689177600000, dt=2023-07-12_16:00:00.000
newest_ut_value: 1692774000000
Fetch up to 1000 pts from timestamp=1692777600000, dt=2023-08-23_08:00:00.000
newest_ut_value: 1696374000000
Fetch up to 1000 pts from timestamp=1696377600000, dt=2023-10-04_00:00:00.000
newest_ut_value: 1699974000000
Fetch up to 1000 pts from timestamp=1699977600000, dt=2023-11-14_16:00:00.000
newest_ut_value: 1703574000000
Fetch up to 1000 pts from timestamp=1703577600000, dt=2023-12-26_08:00:00.000
Just saved df with 5387 rows to new file /Users/trentmc/code/pdr-backend/parquet_data/binance_BTC-USDT_1h.parquet
Update rawohlcv file at exchange=binance, pair=BTC/USDT: done
Update all rawohlcv files: done
Load rawohlcv file.
Get historical data, across many exchanges & pairs: done.
Traceback (most recent call last):
  File "/Users/trentmc/code/pdr-backend/./pdr", line 6, in <module>
    cli_module._do_main()
  File "/Users/trentmc/code/pdr-backend/venv/lib/python3.11/site-packages/enforce_typing/decorator.py", line 29, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/trentmc/code/pdr-backend/pdr_backend/cli/cli_module.py", line 44, in _do_main
    func(args)
  File "/Users/trentmc/code/pdr-backend/venv/lib/python3.11/site-packages/enforce_typing/decorator.py", line 29, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/trentmc/code/pdr-backend/pdr_backend/cli/cli_module.py", line 55, in do_sim
    sim_engine.run()
  File "/Users/trentmc/code/pdr-backend/venv/lib/python3.11/site-packages/enforce_typing/decorator.py", line 29, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/trentmc/code/pdr-backend/pdr_backend/sim/sim_engine.py", line 92, in run
    self.run_one_iter(test_i, mergedohlcv_df)
  File "/Users/trentmc/code/pdr-backend/venv/lib/python3.11/site-packages/enforce_typing/decorator.py", line 29, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/trentmc/code/pdr-backend/pdr_backend/sim/sim_engine.py", line 106, in run_one_iter
    X, y, _ = model_data_factory.create_xy(mergedohlcv_df, testshift)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/trentmc/code/pdr-backend/pdr_backend/aimodel/aimodel_data_factory.py", line 84, in create_xy
    assert hist_col in mergedohlcv_df.columns, f"missing data col: {hist_col}"
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: missing data col: binance:BTC/USDT:None
```
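One possible reading of the assertion, not confirmed anywhere in this report: the `input_feeds` entry `binance BTC/USDT` names no signal, whereas `predict_feed` does (`c`), so the column name that gets looked up ends in `None`. The sketch below only illustrates that hypothesis. `parse_feed`, `hist_col`, `missing_cols`, the `SIGNAL_CODES` table and the example column list are all hypothetical and are not pdr-backend's actual code or data.

```python
from typing import List, NamedTuple, Optional

# Hypothetical single-letter signal codes, assumed for illustration only.
SIGNAL_CODES = {"o": "open", "h": "high", "l": "low", "c": "close", "v": "volume"}


class Feed(NamedTuple):
    exchange: str
    pair: str
    signal: Optional[str]  # None when the feed string names no signal


def parse_feed(feed_str: str) -> Feed:
    """Split 'exchange pair [signal] [timeframe]' into parts (hypothetical parser)."""
    parts = feed_str.split()
    exchange, pair = parts[0], parts[1]
    signal = None
    for token in parts[2:]:
        if token in SIGNAL_CODES:  # a timeframe such as '1h' is not a signal code
            signal = SIGNAL_CODES[token]
    return Feed(exchange, pair, signal)


def hist_col(feed: Feed) -> str:
    """Build a column name in the shape the error message shows: exchange:pair:signal."""
    return f"{feed.exchange}:{feed.pair}:{feed.signal}"


def missing_cols(feed_strs: List[str], df_columns: List[str]) -> List[str]:
    """Return the column names the feeds would need but the table lacks."""
    needed = [hist_col(parse_feed(s)) for s in feed_strs]
    return [col for col in needed if col not in df_columns]


# Columns assumed to exist in the merged table (illustrative, not from the report).
columns = ["timestamp", "binance:BTC/USDT:close"]

print(missing_cols(["binance BTC/USDT"], columns))    # ['binance:BTC/USDT:None']
print(missing_cols(["binance BTC/USDT c"], columns))  # []
```

If this reading is right, a feed string without a signal would produce a `...:None` column name regardless of what is in parquet_dir, which would be consistent with the failure not depending on whether data was already there.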
To reproduce
(In yaml-cli2 branch)
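ppss.yaml had default settings; the relevant values are listed in the settings section above.
Run sim_engine: `pdr sim ppss.yaml`
Then it fails with the AssertionError shown in the full traceback above.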