low-level policy optimization setup

jamesjjk commented 2 months ago

Very interesting paper. Incredibly insightful.

The paper specifically mentions:

"The coefficient α_l of each sub-agent is tuned separately over {0, 1, 4} and selected based on the mean return rate of the validation subset with the same label of the agent."

However, the .sh script seems to follow a different pattern. For the 'vol' classifier it uses --alpha 4 for label_1, --alpha 1 for label_2, and --alpha 1 for label_3. Is this correct? Could you comment on this?

nohup python -u RL/agent/low_level.py --alpha 1 --clf 'slope' --dataset 'ETHUSDT' --device 'cuda:0' \
    --label label_1 >./logs/low_level/ETHUSDT/slope_1.log 2>&1 &

nohup python -u RL/agent/low_level.py --alpha 4 --clf 'slope' --dataset 'ETHUSDT' --device 'cuda:1' \
    --label label_2 >./logs/low_level/ETHUSDT/slope_2.log 2>&1 &

nohup python -u RL/agent/low_level.py --alpha 0 --clf 'slope' --dataset 'ETHUSDT' --device 'cuda:2' \
    --label label_3 >./logs/low_level/ETHUSDT/slope_3.log 2>&1 &

nohup python -u RL/agent/low_level.py --alpha 4 --clf 'vol' --dataset 'ETHUSDT' --device 'cuda:0' \
    --label label_1 >./logs/low_level/ETHUSDT/vol_1.log 2>&1 &

nohup python -u RL/agent/low_level.py --alpha 1 --clf 'vol' --dataset 'ETHUSDT' --device 'cuda:1' \
    --label label_2 >./logs/low_level/ETHUSDT/vol_2.log 2>&1 &

nohup python -u RL/agent/low_level.py --alpha 1 --clf 'vol' --dataset 'ETHUSDT' --device 'cuda:2' \
    --label label_3 >./logs/low_level/ETHUSDT/vol_3.log 2>&1 &

jamesjjk commented 2 months ago

@ZONG0004 Any feedback?

ZONG0004 commented 2 months ago

Hi, sorry for the late response. As mentioned, the hyperparameters provided in the scripts were tuned over {0, 1, 4}. We omitted the selection process and provide only the final result. You can also train every candidate sub-agent with alpha set to 0, 1, and 4 separately, then pick the best one for each label based on the validation result on the subset with that label. The outcome should be the same.
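For anyone who wants to reproduce the sweep rather than rely on the final values, the following is a minimal sketch of how the candidate runs could be enumerated. The flags are the ones from the commands quoted in this thread. The `DRY_RUN` and `NUM_GPUS` variables, the round-robin device assignment, and the `_alpha<N>` log suffix are assumptions made for illustration and are not part of the repository's scripts.

```shell
#!/usr/bin/env bash
# Sketch: one low-level training run per (clf, label, alpha) combination.
# DRY_RUN, NUM_GPUS and the log-file naming are hypothetical helpers.
set -euo pipefail

DATASET='ETHUSDT'
DRY_RUN="${DRY_RUN:-1}"    # 1 = only print the commands, 0 = execute them
NUM_GPUS="${NUM_GPUS:-3}"  # assumed number of visible CUDA devices

sweep_commands() {
  local i=0 clf label alpha device log
  for clf in slope vol; do
    for label in label_1 label_2 label_3; do
      for alpha in 0 1 4; do
        # Spread the runs over the devices in round-robin order.
        device="cuda:$(( i % NUM_GPUS ))"
        # Keep one log per candidate so runs do not overwrite each other.
        log="./logs/low_level/${DATASET}/${clf}_${label#label_}_alpha${alpha}.log"
        echo "python -u RL/agent/low_level.py --alpha ${alpha} --clf '${clf}' --dataset '${DATASET}' --device '${device}' --label ${label} >${log} 2>&1"
        i=$(( i + 1 ))
      done
    done
  done
}

if [ "$DRY_RUN" = "1" ]; then
  sweep_commands
else
  mkdir -p "./logs/low_level/${DATASET}"
  # Runs the candidates one after another in the foreground.
  sweep_commands | while IFS= read -r cmd; do eval "$cmd"; done
fi
```

With the default `DRY_RUN=1` the script only prints the 18 commands (2 classifiers x 3 labels x 3 alpha values), so they can be reviewed before anything is launched.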

Hope this clarifies your questions.
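The selection rule itself (keep, for each label, the alpha with the highest mean validation return) could be sketched as below. The input format `clf,label,alpha,mean_return` and the function name `pick_best_alpha` are hypothetical, and the numbers are invented placeholders that only demonstrate the mechanics; they are not results from the paper or the repository.

```shell
#!/usr/bin/env bash
# Sketch: pick the best alpha per (clf, label) from a validation summary.
# Expected input on stdin, one line per run: clf,label,alpha,mean_return
set -euo pipefail

pick_best_alpha() {
  awk -F',' '
    {
      key = $1 "," $2
      # Keep the alpha whose mean return is the highest seen so far.
      if (!(key in best) || $4 + 0 > best[key] + 0) {
        best[key] = $4
        alpha[key] = $3
      }
    }
    END {
      for (key in best) print key "," alpha[key]
    }
  ' | sort
}

# Placeholder numbers, made up for this example.
printf '%s\n' \
  'vol,label_1,0,0.10' \
  'vol,label_1,1,0.20' \
  'vol,label_1,4,0.30' \
  'vol,label_2,0,-0.05' \
  'vol,label_2,1,0.15' \
  'vol,label_2,4,0.05' | pick_best_alpha
# prints:
# vol,label_1,4
# vol,label_2,1
```

In practice the summary lines would come from whatever validation metric the training runs report; how that metric is extracted from the logs is not covered in this thread.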


ZONG0004 / MacroHFT

low-level policy optimization setup #6