nakamotoo / Cal-QL

Official implementation of our paper "Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning"
https://nakamotoo.github.io/Cal-QL

experiment #7

Closed Jqq3482840604 closed 2 months ago

Jqq3482840604 commented 4 months ago

Could you please tell me whether the CQL baseline in the paper updates its offline buffer during the fine-tuning phase, or keeps the buffer fixed?

nakamotoo commented 3 months ago

Hi, for CQL we used a mixing-ratio hyperparameter to mix the offline buffer and the online buffer, as shown in Table 3 of the paper.
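
For readers following the thread, here is a minimal sketch of how a mixing ratio could combine two buffers when forming a training batch. It is not taken from the Cal-QL repository: the function name `sample_mixed_batch`, its parameters, the rounding rule, and the reading of the ratio as the offline fraction of each batch are all assumptions made for illustration. The values actually used are the ones in Table 3 of the paper.

```python
import random


def sample_mixed_batch(offline_buffer, online_buffer, batch_size, mixing_ratio, rng=random):
    """Draw a batch in which about `mixing_ratio` of the samples are offline.

    Hypothetical helper: names and the rounding rule are assumptions,
    not the repository's implementation.
    """
    if not 0.0 <= mixing_ratio <= 1.0:
        raise ValueError("mixing_ratio must be in [0, 1]")

    # Assumption: mixing_ratio is the fraction of the batch taken from offline data.
    n_offline = int(round(batch_size * mixing_ratio))
    if not online_buffer:
        # No online transitions collected yet, so fall back to offline data only.
        n_offline = batch_size
    n_online = batch_size - n_offline

    # Sample with replacement from each buffer, then shuffle the combined batch.
    batch = [rng.choice(offline_buffer) for _ in range(n_offline)]
    batch += [rng.choice(online_buffer) for _ in range(n_online)]
    rng.shuffle(batch)
    return batch


# Toy transitions tagged by origin so the split is easy to inspect.
offline = [("offline", i) for i in range(1000)]
online = [("online", i) for i in range(50)]
batch = sample_mixed_batch(offline, online, batch_size=256, mixing_ratio=0.5)
```

In this sketch the offline buffer is never modified; new transitions would be appended only to the online buffer, and the ratio controls how much each buffer contributes per update.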