Training logic for powerset

We have already added a command-line argument (in main.py) where users can specify the domains or datasets. Next, each domain needs to load its own Dataset class. Then, on each iteration over the powerset of sources, we use torch's ConcatDataset to concatenate the selected domains into a single Dataset. If we have N domains, we will end up creating 2^N - 1 concatenated datasets, one per non-empty subset.

Where this could happen
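A minimal sketch of that iteration, assuming itertools for the powerset and torch.utils.data.ConcatDataset for the merge; the domain names and TensorDatasets below are placeholders for the real per-domain Dataset classes:

```python
from itertools import chain, combinations

import torch
from torch.utils.data import ConcatDataset, TensorDataset

def nonempty_powerset(items):
    """All non-empty subsets of items; there are 2**len(items) - 1 of them."""
    return chain.from_iterable(
        combinations(items, r) for r in range(1, len(items) + 1)
    )

# Hypothetical stand-ins for the real per-domain Dataset classes.
domain_datasets = {
    "domain_a": TensorDataset(torch.zeros(10, 3)),
    "domain_b": TensorDataset(torch.ones(20, 3)),
    "domain_c": TensorDataset(torch.full((30, 3), 2.0)),
}

for subset in nonempty_powerset(sorted(domain_datasets)):
    # ConcatDataset merges the selected domains into one Dataset.
    combined = ConcatDataset([domain_datasets[name] for name in subset])
```

With three domains this loop runs 2^3 - 1 = 7 times, ending with the full concatenation of all domains.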
It seems to me that this belongs inside the run-training function. That is, around the for ... in range(epochs) loop there will be something like for domain_subset in powerset:. This means we will need to pass the Dataset objects into this function, not the DataLoaders.
We can schedule a meeting to discuss this in detail.