Closed Jqq3482840604 closed 2 months ago
Sorry, could you please tell me whether the CQL baseline in the paper updates its offline buffer during the fine-tuning phase, or keeps the buffer fixed?
Hi, for CQL we used a mixing ratio hyperparameter to mix samples from the offline buffer and the online buffer, as shown in Table 3 of the paper.
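For readers unfamiliar with the idea, here is a minimal sketch of what sampling with a mixing ratio could look like. It is not the authors' implementation: the function name `sample_mixed_batch`, the list-based buffers, and the convention that `mixing_ratio` is the fraction of each batch drawn from offline data are all assumptions made for illustration. Whether the offline buffer itself changes during fine-tuning is not stated in the reply, so the sketch simply treats both buffers as given.

```python
import random


def sample_mixed_batch(offline_buffer, online_buffer, batch_size, mixing_ratio, rng=random):
    """Draw one training batch from two buffers.

    Assumption (hypothetical convention): `mixing_ratio` in [0, 1] is the
    fraction of the batch taken from the offline buffer; the rest comes
    from the online buffer. Sampling is uniform with replacement.
    """
    n_offline = int(round(batch_size * mixing_ratio))
    # While no online transitions exist yet, fall back to offline data only.
    if not online_buffer:
        n_offline = batch_size
    n_online = batch_size - n_offline

    batch = [rng.choice(offline_buffer) for _ in range(n_offline)]
    batch += [rng.choice(online_buffer) for _ in range(n_online)]
    rng.shuffle(batch)  # avoid a fixed offline-then-online ordering
    return batch


# Toy usage: transitions are tagged with their source for inspection.
offline = [("offline", i) for i in range(100)]
online = [("online", i) for i in range(10)]
batch = sample_mixed_batch(offline, online, batch_size=8, mixing_ratio=0.25)
```

With `batch_size=8` and `mixing_ratio=0.25`, two samples in the batch come from the offline list and six from the online list. The actual ratio values used for CQL are the ones listed in Table 3 of the paper.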