snumprlab / cl-alfred

Official Implementation of CL-ALFRED (ICLR'24)
https://bhkim94.github.io/projects/CL-ALFRED/
GNU General Public License v3.0

Hello. Is each task (e.g. 'data/json_feat_2.1.0/look_at_obj_in_light-Box-None-FloorLamp-212/trial_T20190908_193427_340509') trained args.temp_batchsize times? #6

Closed · wqshmzh closed this issue 4 months ago

wqshmzh commented 4 months ago

I just found that temp_batch (composed of several tasks), whose batch size is args.temp_batchsize, is iterated over args.temp_batchsize times. Would you mind explaining why the "iterations" argument of the "online_train" function equals self.temp_batchsize, which is self.batch_size - 2 * self.batch_size // 3? :-) I think each task should be iterated over only once, since the title starts with "online continual learning".

dbd05088 commented 4 months ago

Hi @wqshmzh

Thank you for having an interest in our work!

In online continual learning, the notion of "online" differs across the literature. Some works [1, 2] use it to mean that each streamed sample is used only once to train the model, while others [3, 4] use it to mean that only one or a few samples are streamed at a time. We follow the latter definition, since the former allows storing the whole current task's data, which is close to the offline continual learning setup and less realistic. Under this setup, we cannot access previously encountered data for training except for what is stored in episodic memory, but we may use streaming data multiple times for training at the moment it is encountered.

Specifically, "online_iteration" refers to the number of batch iterations per sample encounter. While CLIB composes training batches by retrieving only from episodic memory, ER-based methods (e.g., ER, DER, MIR) compose training batches using both current stream data and data from episodic memory (e.g., in ER and MIR, half of the training batch is composed of current stream data). ER-based methods accumulate streaming data until the number of streaming data equals self.temp_batchsize and retrieve a size of (self.batchsize - self.temp_batchsize) from episodic memory. Since "online_iteration" refers to the number of iterations per sample encounter, we train the model using the training batch with (online_iteration * self.temp_batchsize).

[1] Prabhu et al., GDumb: A Simple Approach that Questions Our Progress in Continual Learning, ECCV 2020
[2] Bang et al., Rainbow Memory: Continual Learning with a Memory of Diverse Samples, CVPR 2021
[3] Aljundi et al., Gradient Based Sample Selection for Online Continual Learning, NeurIPS 2019
[4] Koh et al., Online Continual Learning on Class Incremental Blurry Task Configuration with Anytime Inference, ICLR 2022

wqshmzh commented 4 months ago

OK, I see. I have another concern: self.num_updates, which plays the role of "online_iteration", is incremented by 1 at a time until len(temp_batch) == self.temp_batchsize in "online_step". Why does "online_iteration" end up equal to self.temp_batchsize? Is it supposed to be specified manually as a fixed value? I would appreciate it if you could explain the reason a little. Thank you very much.

dbd05088 commented 4 months ago

I can't quite understand the part where you say "online_iteration" equals self.temp_batchsize, so I'll explain the overall training process in more detail.

online_iter refers to the number of batch iterations per encountered sample, and it is set to 1 by default. You can modify it, of course.

For ER-based methods, we accumulate samples until len(temp_batch) == self.temp_batchsize, as you mentioned. Then, we train the model for num_iter (= len(self.temp_batch) * self.online_iter) batch iterations.

In principle, each time a new sample is encountered, the model should be trained for self.online_iter batch iterations. However, since we accumulate stream data and do not train the model until len(temp_batch) == self.temp_batchsize, we train it for len(self.temp_batch) * self.online_iter iterations at that point, to make up for the training deferred during accumulation. A sketch of this accumulate-then-train loop follows.
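For concreteness, here is a minimal sketch of the accumulate-then-train logic just described, assuming the default online_iter = 1. The class and method names are simplified stand-ins, not the repository's exact code.

```python
class ERStepSketch:
    """Simplified accumulate-then-train loop (illustrative only)."""

    def __init__(self, temp_batchsize, online_iter=1.0):
        self.temp_batchsize = temp_batchsize
        self.online_iter = online_iter
        self.temp_batch = []
        self.num_updates = 0.0

    def online_step(self, sample):
        self.temp_batch.append(sample)
        # One pending update (scaled by online_iter) per streamed sample.
        self.num_updates += self.online_iter
        if len(self.temp_batch) == self.temp_batchsize:
            # Catch up on the updates deferred during accumulation:
            # int(num_updates) == online_iter * temp_batchsize here.
            self.online_train(self.temp_batch,
                              iterations=int(self.num_updates))
            self.temp_batch = []
            self.num_updates -= int(self.num_updates)

    def online_train(self, batch, iterations):
        for _ in range(iterations):
            pass  # placeholder for one batch iteration of training
```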

If you have any concerns, feel free to ask more questions anytime! Thank you

wqshmzh commented 4 months ago

I am sorry for not asking very clearly. I set "CAMA" as the model. As we know, self.temp_batch is empty and self.num_updates is 0 at the beginning. In the "online_step" function, the program executes self.temp_batch.append(sample) and self.num_updates += self.online_iter, then exits "online_step" because len(self.temp_batch) != self.temp_batchsize at this point. Back in the "run_train" function, the program executes data = cur_train_data_list[data_idx] to receive another sample, then steps into cl_method.online_step(data, samples_cnt, task_process_bar) again, executing self.temp_batch.append(sample) and self.num_updates += self.online_iter once more. By the time len(self.temp_batch) == self.temp_batchsize in "online_step", self.num_updates also equals self.temp_batchsize, because both len(self.temp_batch) and self.num_updates grow by 1 per sample (with the default self.online_iter = 1), and self.online_train receives int(self.num_updates) as its "iterations" argument.
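Here is a tiny runnable trace of the behavior I mean, with made-up numbers (online_iter = 1, temp_batchsize = 4); it shows int(num_updates) reaching temp_batchsize exactly when the buffer fills.

```python
online_iter, temp_batchsize = 1.0, 4
temp_batch, num_updates = [], 0.0

for sample in range(8):  # eight streamed samples
    temp_batch.append(sample)
    num_updates += online_iter
    if len(temp_batch) == temp_batchsize:
        # Both counters grew by 1 per sample, so they match here.
        print(f"train for int(num_updates) = {int(num_updates)} iterations")
        temp_batch, num_updates = [], num_updates - int(num_updates)
```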

JACK-Chen-2019 commented 4 months ago

I think he might be suggesting that "iterations" should be a fixed value representing the number of times each sample is trained on. "iterations" should not be tied to the batch size: when the batch size changes, the average number of training passes per sample also changes, which could lead to performance differences when training with different batch sizes. A small numeric illustration is below.
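As a quick illustration of the concern, using the default temp_batchsize = batch_size - 2 * batch_size // 3 mentioned above (the numbers are hypothetical):

```python
for batch_size in (16, 32):
    temp_batchsize = batch_size - 2 * batch_size // 3
    iterations = 1 * temp_batchsize  # online_iter * temp_batchsize
    # With ER-style batching, every accumulated stream sample appears
    # in each of these iterations, so its training count tracks batch size.
    print(f"batch_size={batch_size}: each sample trained {iterations} times")
# batch_size=16 -> each sample trained 6 times
# batch_size=32 -> each sample trained 11 times
```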

bhkim94 commented 4 months ago

Closing this issue due to inactivity.