Hi Chelsea! In you paper, during meta training process, the inner gradients(theta') are calculated independently for different tasks within one batch. Then the updated parameters are applied to the new samples to calculate the outer gradients(theta). However, in your code maml.py - task_metalearn(), I didn't see independent calculation of the inner gradients for different tasks within a batch. Can I ask why?
The map_fn parallelizes across tasks in the batch. task_metalearn corresponds computation for a single task. Hence, the gradient computation is performed separately for each task.
Hi Chelsea! In you paper, during meta training process, the inner gradients(theta') are calculated independently for different tasks within one batch. Then the updated parameters are applied to the new samples to calculate the outer gradients(theta). However, in your code maml.py - task_metalearn(), I didn't see independent calculation of the inner gradients for different tasks within a batch. Can I ask why?