tristandeleu / pytorch-meta

A collection of extensions and data-loaders for few-shot learning & meta-learning in PyTorch
https://tristandeleu.github.io/pytorch-meta/
MIT License

How do I have torchmeta use different data examples for each meta-batch when looping through the entire dataloader? #112

Open brando90 opened 3 years ago

brando90 commented 3 years ago

I want to loop through a meta-dataloader multiple times (which I've been able to do easily, especially in regression tasks), but on the second pass I want the data examples to be different. Will torchmeta do that automatically?

For example, if I have a regression problem with 200 functions and my meta-batch size is 200, then every batch from the meta-dataloader gives me all 200 functions. E.g.:

print('-- start analysis --')
print(f'number of workers = {args.num_workers}')
print(f'--> args.meta_batch_size = {args.meta_batch_size_eval}')
print(f'--> args.iters = {args.iters}')
print(f'--> args.nb_inner_train_steps = {args.nb_inner_train_steps}')

print(meta_dataloader.batch_size)
# looping through the data set multiple times with different examples: https://github.com/tristandeleu/pytorch-meta/issues/112
with tqdm(range(args.iters)) as pbar:
    it = 0
    while it < args.iters:
        for batch_idx, batch in enumerate(meta_dataloader):
            # batch holds one meta-batch of tasks, e.g. 200 regression functions
            print(f'it = {it}')
            spt_x, spt_y, qry_x, qry_y = process_meta_batch(args, batch)
            print(spt_x.mean())
            print(qry_x.mean())
            it += 1
            pbar.update()
            if it >= args.iters:
                break

I want that each time `it` increases by 1, I get 200 tasks, but each time with different examples.
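For concreteness, here is a sketch of the kind of setup I mean, using torchmeta's toy sinusoid helper as a stand-in for my regression dataset (the shot counts are made up, and I'm assuming the helper forwards num_tasks to Sinusoid):

from torchmeta.toy.helpers import sinusoid
from torchmeta.utils.data import BatchMetaDataLoader

# 200 fixed regression tasks (functions), 10 support + 10 query samples per task
dataset = sinusoid(shots=10, test_shots=10, num_tasks=200)
meta_dataloader = BatchMetaDataLoader(dataset, batch_size=200, num_workers=0)

batch = next(iter(meta_dataloader))
spt_x, spt_y = batch['train']  # support inputs/targets for all 200 tasks
qry_x, qry_y = batch['test']   # query inputs/targets for all 200 tasks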

brando90 commented 3 years ago

The answer might be here... let's see if I can extract it from this giant discussion: https://github.com/tristandeleu/pytorch-meta/issues/69

brando90 commented 3 years ago

Confirmed that my current code is sampling the same set of examples :(

-- start analysis --
number of workers = 0
--> args.meta_batch_size = 200
--> args.iters = 5
--> args.nb_inner_train_steps = 1
200
 20%|██        | 1/5 [00:00<00:03,  1.10it/s]it = 0
tensor(-0.0510)
tensor(-0.0304)
it = 1
tensor(-0.0510)
tensor(-0.0304)
 60%|██████    | 3/5 [00:02<00:01,  1.20it/s]it = 2
tensor(-0.0510)
tensor(-0.0304)
it = 3
tensor(-0.0510)
tensor(-0.0304)
100%|██████████| 5/5 [00:03<00:00,  1.27it/s]
it = 4
tensor(-0.0510)
tensor(-0.0304)
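
Means could in principle collide, so here is a stricter, order-independent check (a sketch, assuming the batch['train'] layout from the code above):

import torch

def support_fingerprint(batch):
    # order-independent fingerprint: all support inputs, flattened and sorted
    spt_x, _ = batch['train']
    return torch.sort(spt_x.flatten()).values

fingerprints = []
for _ in range(2):  # two full passes over the dataloader
    for batch in meta_dataloader:
        fingerprints.append(support_fingerprint(batch))
# True here means the two passes drew literally the same support inputs
print(torch.allclose(fingerprints[0], fingerprints[-1]))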
brando90 commented 3 years ago

It seems that if I use a meta-batch size exactly 1 less than the total number of tasks, then I do get different examples:

-- start analysis --
number of workers = 0
--> args.meta_batch_size = 199
--> args.iters = 5
--> args.nb_inner_train_steps = 1
199
 20%|██        | 1/5 [00:00<00:03,  1.08it/s]
it = 0
tensor(-0.0486)
tensor(-0.0302)
it = 1
tensor(-0.5242)
tensor(-0.0590)
it = 2
tensor(-0.0503)
tensor(-0.0307)
it = 3
tensor(-0.1983)
tensor(0.0430)
100%|██████████| 5/5 [00:02<00:00,  1.99it/s]
it = 4
tensor(-0.0530)
tensor(-0.0305)

which seems bizarre to me.

brando90 commented 3 years ago

Applying the random hash trick didn't work (though I feel like I'm just trying random things until something works):

import random

# attempt: give the dataloader a random hash so tasks get re-sampled (did not work)
meta_dataloader.__hash__ = lambda x: random.randrange(1 << 32)

print(meta_dataloader.batch_size)
# looping through the data set multiple times with different examples: https://github.com/tristandeleu/pytorch-meta/issues/112
with tqdm(range(args.iters)) as pbar:
    it = 0
    while it < args.iters:
        for batch_idx, batch in enumerate(meta_dataloader):
            # batch holds one meta-batch of tasks, e.g. 200 regression functions
            print(f'\nit = {it}')
            spt_x, spt_y, qry_x, qry_y = process_meta_batch(args, batch)
            print(spt_x.mean())
            print(qry_x.mean())
            it += 1
            pbar.update()
            if it >= args.iters:
                break

got:

-- start analysis --
number of workers = 0
--> args.meta_batch_size = 200
--> args.iters = 5
--> args.nb_inner_train_steps = 1
200
 20%|██        | 1/5 [00:00<00:03,  1.06it/s]
it = 0
tensor(-0.0510)
tensor(-0.0304)
it = 1
tensor(-0.0510)
tensor(-0.0304)
 60%|██████    | 3/5 [00:02<00:01,  1.16it/s]
it = 2
tensor(-0.0510)
tensor(-0.0304)
it = 3
tensor(-0.0510)
tensor(-0.0304)
100%|██████████| 5/5 [00:04<00:00,  1.20it/s]
it = 4
tensor(-0.0510)
tensor(-0.0304)
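
(A side note on why this exact trick is a no-op regardless of Torchmeta's seeding: Python looks up special methods like __hash__ on the type, not on the instance, so assigning to meta_dataloader.__hash__ never changes what hash(meta_dataloader) returns. A minimal sketch with a stand-in class, not Torchmeta's dataloader:)

import random

class Loader:
    pass

loader = Loader()
loader.__hash__ = lambda: random.randrange(1 << 32)

# hash() bypasses the instance attribute and uses type(loader).__hash__
print(hash(loader) == type(loader).__hash__(loader))  # True
print(hash(loader) == loader.__hash__())              # almost surely False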
brando90 commented 3 years ago

OK, this is probably why I am seeing the same examples being sampled (https://github.com/tristandeleu/pytorch-meta/issues/69#issuecomment-707610673):

> Also note that it doesn't mean that the same 20 images are always used for all tasks involving a specific class. See the second part of #67 (comment) for an example. In that sense, when shuffle=True, all the images from the dataset get a chance to be sampled for some task. The one-to-one correspondence only means that if you sample the same task (i.e. the same tuple of classes for the classification task), you'll get the same images.

In principle, if the same task is sampled (e.g. the same set of classes), then we get the same examples: if I sample {zebra, car} and later sample that same tuple again, I get the same set of images.
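
To make that concrete, a sketch using the Omniglot helper from the README (download=True fetches the data; indexing by a tuple of class indices is how combination meta-datasets identify a task):

import torch
from torchmeta.datasets.helpers import omniglot

dataset = omniglot('data', ways=5, shots=5, test_shots=15,
                   meta_train=True, download=True)

# the same tuple of class indices is the same task
task_a = dataset[(0, 1, 2, 3, 4)]
task_b = dataset[(0, 1, 2, 3, 4)]

x_a, y_a = task_a['train'][0]
x_b, y_b = task_b['train'][0]
# by the design choice quoted above, the same task yields the same images
print(torch.equal(x_a, x_b))  # expected: True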

However, the issue I am having is in regression, where one entry in the meta-batch corresponds to one function, i.e. one task. Whether I sample 199 or 200 functions (200 being the entire set), each function is still its own task; the 200 together are not one task. So the meta-batch size (1, 199, 200, whatever) should not matter: by the design above, sampling the same task should always give the same examples, yet with 199 it doesn't, which seems like a bug in torchmeta. For now I am happy with the bug, since it means I can get different examples by switching the meta-batch size to 199. Can you confirm this @tristandeleu, is this a bug?

(Note that in classification, using all the classes means the same task is always being sampled, so getting the same examples is expected there. In regression, though, the functions are all distinct tasks, so even if I sample all of them I should get different examples; that is my interpretation of how it should work. Either way I am OK with this bug, because I can get different examples by using 1 function less than the total number of functions.)

brando90 commented 3 years ago

My ideal solution is for it to work with C = N (all classes/functions, e.g. 64 or 200) and still get different examples.

tristandeleu commented 3 years ago

Having the same data when sampling a task multiple times, even in regression problems, is a design choice in Torchmeta for reproducibility, which I explained in #69, so that is not a bug. You are right, the random hash trick (which should be applied to the Task object, and not the dataloader) should not help in your case, because you want a fixed set of functions but different samples for each function at every iteration (as opposed to different sets of functions).

There is no way of doing that out of the box in Torchmeta; you would need to create your own dataset to have this feature. Taking the example of Sinusoid, the simplest fix would be to not pass np_random when getting the task here: https://github.com/tristandeleu/pytorch-meta/blob/389e35ef9aa812f07ce50a3f3bd253c4efb9765c/torchmeta/toy/sinusoid.py#L84-L86 (remove np_random, or set np_random=None).
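
Concretely, a sketch of what that subclass could look like (untested; the attribute names and the SinusoidTask signature are taken from sinusoid.py at the commit linked above and may differ in other versions):

from torchmeta.toy import Sinusoid
from torchmeta.toy.sinusoid import SinusoidTask

class ResampledSinusoid(Sinusoid):
    """Same fixed set of sinusoid tasks, but fresh samples every time a
    task is requested: np_random=None makes SinusoidTask create its own
    new RandomState instead of reusing the dataset-level one."""
    def __getitem__(self, index):
        amplitude, phase = self.amplitudes[index], self.phases[index]
        task = SinusoidTask(index, amplitude, phase, self._input_range,
            self.noise_std, self.num_samples_per_task, self.transform,
            self.target_transform, np_random=None)
        if self.dataset_transform is not None:
            task = self.dataset_transform(task)
        return task

The trade-off is that you lose exact reproducibility of the samples, which is precisely the design choice being relaxed here.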

brando90 commented 3 years ago

> Having the same data when sampling a task multiple times, even in regression problems, is a design choice in Torchmeta for reproducibility, which I explained in #69, so that is not a bug. You are right, the random hash trick (which should be applied to the Task object, and not the dataloader) should not help in your case, because you want a fixed set of functions but different samples for each function at every iteration (as opposed to different sets of functions).
>
> There is no way of doing that out of the box in Torchmeta; you would need to create your own dataset to have this feature. Taking the example of Sinusoid, the simplest fix would be to not pass np_random when getting the task here: https://github.com/tristandeleu/pytorch-meta/blob/389e35ef9aa812f07ce50a3f3bd253c4efb9765c/torchmeta/toy/sinusoid.py#L84-L86 (remove np_random, or set np_random=None).

Cool, I will see how this applies to my data set. Thanks for the reply!

I will reply (and hopefully close the issue) once I have it working, and let you know.