dvgodoy / PyTorchStepByStep

Official repository of my book: "Deep Learning with PyTorch Step-by-Step: A Beginner's Guide"
https://pytorchstepbystep.com
MIT License
Missing input in helpers.py for function "make_balanced_sampler" and WeightedRandomSampler? #15

Open minertom opened 3 years ago

minertom commented 3 years ago

I just downloaded the latest full zip file.

While running the jupyter notebook for Chapter05, I noticed a couple of errors, relating to

```
     22 
     23 # Builds a weighted random sampler to handle imbalanced classes
---> 24 sampler = make_balanced_sampler(y_train_tensor)
     25 
     26 # Uses sampler in the training set to get a balanced data loader

~/projects/pytorchstepbystep_2/PyTorchStepByStep-master/helpers.py in make_balanced_sampler(y)
     79         num_samples=len(sample_weights),
     80         generator=generator,
---> 81         replacement=True
     82     )
     83     return sampler

TypeError: __init__() got an unexpected keyword argument 'generator'
```

I think that the issue stems from the `WeightedRandomSampler` function, which might be missing an input. From the PyTorch documentation: `class torch.utils.data.WeightedRandomSampler(weights: Sequence[float], num_samples: int, replacement: bool = True, generator=None)`

That would be 4 inputs. "replacement" and "generator" are satisfied but either "weights" or "num_samples" seems to be missing from the call to WeightedRandomSampler from make_balanced_sampler. My guess would be that "num_samples" is the missing input.

Is that the case?

Thank You Tom

dvgodoy commented 3 years ago

Hi Tom,

The message "TypeError: `__init__()` got an unexpected keyword argument 'generator'" has the answer - the call to `WeightedRandomSampler` has an extra, unexpected `generator` argument.

But, as you've pointed out, this argument is indeed in the documentation. I believe the issue is your PyTorch version - the `generator` argument was added in version 1.6, so if you're running into this message, I'd suggest upgrading PyTorch.
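To illustrate the difference between an extra argument and a missing one, here is a minimal sketch in plain Python. `OldSampler` and `try_build` are hypothetical stand-ins invented for this illustration, not PyTorch or repository code. `OldSampler` only mimics a constructor that predates the `generator` argument.

```python
class OldSampler:
    """Hypothetical stand-in for a sampler whose __init__ has no `generator`."""

    def __init__(self, weights, num_samples, replacement=True):
        self.weights = weights
        self.num_samples = num_samples
        self.replacement = replacement


def try_build(sampler_cls, **kwargs):
    """Return the TypeError message raised by the constructor, or None on success."""
    try:
        sampler_cls(**kwargs)
    except TypeError as err:
        return str(err)
    return None


# An argument the constructor does not know about: "unexpected keyword argument".
extra = try_build(OldSampler, weights=[0.5, 0.5], num_samples=2,
                  replacement=True, generator=None)

# A required argument left out: "missing 1 required positional argument".
missing = try_build(OldSampler, weights=[0.5, 0.5], replacement=True)

print(extra)
print(missing)
```

The exact wording of the two messages varies slightly between Python versions, but a missing `num_samples` would be reported as a missing argument, not as an unexpected `generator`.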

To check which version you're using, just run `import torch` and then `print(torch.__version__)`.
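A minimal sketch of that check, assuming the 1.6 threshold mentioned above. The helper names `parse_major_minor` and `sampler_accepts_generator` are made up for this example and are not part of PyTorch or of `helpers.py`.

```python
import re


def parse_major_minor(version):
    """Extract (major, minor) from a version string such as '1.7.0+cu101'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def sampler_accepts_generator(version):
    """True if the version is at least 1.6, the release named in this thread."""
    return parse_major_minor(version) >= (1, 6)


try:
    import torch
    print(torch.__version__, sampler_accepts_generator(torch.__version__))
except ImportError:
    # Keeps the sketch runnable even where PyTorch is not installed.
    print("PyTorch is not installed in this environment")
```

Comparing integer tuples rather than raw strings avoids ranking "1.10" below "1.6".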

Please let me know if this solves your issue.

Best, Daniel

minertom commented 3 years ago

Hi Daniel,

Thank you for your response. It took me a day to get to it, but I did check my PyTorch version: running `import torch` and `print(torch.__version__)` gives `1.7.0`.

I do not believe that I can upgrade to a more current version. Before I do anything foolish, would you recommend downgrading the pytorch version? Which version did you use?

Regards Tom

dvgodoy commented 3 years ago

Hi Tom,

I see... it should have worked then - I tested it using both 1.6 and 1.7 versions. So, no need to downgrade, but we need to dig deeper to figure out what's happening:

1) In the notebook for Chapter 4, in the "Helper Function #5" section, there is a call to that function: `sampler = make_balanced_sampler(y_train_tensor)`. Did it work for you?

2) I assumed the error you reported happened in Chapter 5, in the "Data Preparation" section, which has the same code: `sampler = make_balanced_sampler(y_train_tensor)`. Is this where the error happened?

3) In the notebook that's raising the error, please run the following code: `from torch.utils.data import WeightedRandomSampler` and then `WeightedRandomSampler?`

This will show the signature for this function. It should look like this: `Init signature: WeightedRandomSampler(weights, num_samples, replacement=True, generator=None)`. Does it? If it shows something different (that is, without the `generator` argument), it may be some weird virtual environment / configuration issue.

4) In the signature, there is also "File" information, like: `File: ~/anaconda3/envs/torch/lib/python3.7/site-packages/torch/utils/data/sampler.py`. In my case, it is running in the torch environment, as indicated by `/envs/torch`. Does the environment showing up in your signature match the one you're using for running everything? That's somewhat of a long shot, but you never know...
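Steps 3 and 4 can also be done outside IPython, where the `?` syntax is unavailable. The following is a rough sketch using the standard `inspect` module. `describe_callable` is a hypothetical helper written for this example, not part of PyTorch or the repository.

```python
import inspect
import sys


def describe_callable(obj, keyword):
    """Return (accepts_keyword, source_file) for a class or function."""
    params = inspect.signature(obj).parameters
    try:
        source_file = inspect.getsourcefile(obj)
    except (TypeError, OSError):
        # Raised for built-ins or objects whose source file cannot be located.
        source_file = None
    return keyword in params, source_file


try:
    from torch.utils.data import WeightedRandomSampler
    accepts, path = describe_callable(WeightedRandomSampler, "generator")
    print("accepts generator:", accepts)
    print("defined in:", path)
except ImportError:
    # Keeps the sketch runnable even where PyTorch is not installed.
    print("PyTorch is not installed in this environment")

# The interpreter path hints at which environment the kernel is really using.
print("interpreter:", sys.executable)
```

If the printed file path or interpreter path points to an environment other than the intended one, the notebook kernel is probably picking up a different PyTorch installation.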

Having this information, we'll see what to do next :-)

Best, Daniel

minertom commented 3 years ago

Daniel,

Odd, very odd. I have no explanation.

Before I had a chance to look at your email reply, I thought that I would have one more go at solving the problem for myself. I opened the notebook, chapter05, again, and started running the cells, one by one. Yes, the same error was still happening.

This is where things get a little strange. Thinking that I could edit the helpers.py file to do some prints of things like the count etc, I loaded the helpers file in my IDE. Then, I ran the notebook again. To my surprise, the error went away. So, I closed the IDE and ran the notebook again. The error was gone. I have no explanation. Somehow, "touching" the file seemed to fix it.

I unarchived the zip file into a new directory and ran the notebook. No error.

I then read your response and loaded the notebook for Chapter04. I had run this before and had noticed no error. This time, there was no error either.

With no explanation, other than gremlins, I can no longer reproduce the error. So, I can get back to learning from this chapter. I will let you know if things change.

Thank You Tom

dvgodoy commented 3 years ago

Hi,

Well, the best kind of error is the kind that corrects itself, right? :-) The important thing is, it is working!

Best, Daniel