ImageItemList gives wrong type target tensors

terribilissimo commented 5 years ago

This is somewhat related to:

https://github.com/fastai/fastai/issues/839

and:

https://forums.fast.ai/t/error-pytorch-expected-object-of-scalar-type-long-but-got-scalar-type-float-for-argument-2-other/33778/3

Complete debug message:

RuntimeError                            Traceback (most recent call last)
<ipython-input-82-47df9df23a95> in <module>
----> 1 learn.fit_one_cycle(5)

~/anaconda3/envs/dmx1/lib/python3.7/site-packages/fastai/train.py in fit_one_cycle(learn, cyc_len, max_lr, moms, div_factor, pct_start, wd, callbacks, **kwargs)
     20     callbacks.append(OneCycleScheduler(learn, max_lr, moms=moms, div_factor=div_factor,
     21                                         pct_start=pct_start, **kwargs))
---> 22     learn.fit(cyc_len, max_lr, wd=wd, callbacks=callbacks)
     23 
     24 def lr_find(learn:Learner, start_lr:Floats=1e-7, end_lr:Floats=10, num_it:int=100, stop_div:bool=True, **kwargs:Any):

~/anaconda3/envs/dmx1/lib/python3.7/site-packages/fastai/basic_train.py in fit(self, epochs, lr, wd, callbacks)
    170         callbacks = [cb(self) for cb in self.callback_fns] + listify(callbacks)
    171         fit(epochs, self.model, self.loss_func, opt=self.opt, data=self.data, metrics=self.metrics,
--> 172             callbacks=self.callbacks+callbacks)
    173 
    174     def create_opt(self, lr:Floats, wd:Floats=0.)->None:

~/anaconda3/envs/dmx1/lib/python3.7/site-packages/fastai/basic_train.py in fit(epochs, model, loss_func, opt, data, callbacks, metrics)
     92     except Exception as e:
     93         exception = e
---> 94         raise e
     95     finally: cb_handler.on_train_end(exception)
     96 

~/anaconda3/envs/dmx1/lib/python3.7/site-packages/fastai/basic_train.py in fit(epochs, model, loss_func, opt, data, callbacks, metrics)
     87             if not data.empty_val:
     88                 val_loss = validate(model, data.valid_dl, loss_func=loss_func,
---> 89                                        cb_handler=cb_handler, pbar=pbar)
     90             else: val_loss=None
     91             if cb_handler.on_epoch_end(val_loss): break

~/anaconda3/envs/dmx1/lib/python3.7/site-packages/fastai/basic_train.py in validate(model, dl, loss_func, cb_handler, pbar, average, n_batch)
     52             if not is_listy(yb): yb = [yb]
     53             nums.append(yb[0].shape[0])
---> 54             if cb_handler and cb_handler.on_batch_end(val_losses[-1]): break
     55             if n_batch and (len(nums)>=n_batch): break
     56         nums = np.array(nums, dtype=np.float32)

~/anaconda3/envs/dmx1/lib/python3.7/site-packages/fastai/callback.py in on_batch_end(self, loss)
    237         "Handle end of processing one batch with `loss`."
    238         self.state_dict['last_loss'] = loss
--> 239         stop = np.any(self('batch_end', not self.state_dict['train']))
    240         if self.state_dict['train']:
    241             self.state_dict['iteration'] += 1

~/anaconda3/envs/dmx1/lib/python3.7/site-packages/fastai/callback.py in __call__(self, cb_name, call_mets, **kwargs)
    185     def __call__(self, cb_name, call_mets=True, **kwargs)->None:
    186         "Call through to all of the `CallbakHandler` functions."
--> 187         if call_mets: [getattr(met, f'on_{cb_name}')(**self.state_dict, **kwargs) for met in self.metrics]
    188         return [getattr(cb, f'on_{cb_name}')(**self.state_dict, **kwargs) for cb in self.callbacks]
    189 

~/anaconda3/envs/dmx1/lib/python3.7/site-packages/fastai/callback.py in <listcomp>(.0)
    185     def __call__(self, cb_name, call_mets=True, **kwargs)->None:
    186         "Call through to all of the `CallbakHandler` functions."
--> 187         if call_mets: [getattr(met, f'on_{cb_name}')(**self.state_dict, **kwargs) for met in self.metrics]
    188         return [getattr(cb, f'on_{cb_name}')(**self.state_dict, **kwargs) for cb in self.callbacks]
    189 

~/anaconda3/envs/dmx1/lib/python3.7/site-packages/fastai/callback.py in on_batch_end(self, last_output, last_target, **kwargs)
    272         if not is_listy(last_target): last_target=[last_target]
    273         self.count += last_target[0].size(0)
--> 274         self.val += last_target[0].size(0) * self.func(last_output, *last_target).detach().cpu()
    275 
    276     def on_epoch_end(self, **kwargs):

~/anaconda3/envs/dmx1/lib/python3.7/site-packages/fastai/metrics.py in error_rate(input, targs)
     45 def error_rate(input:Tensor, targs:Tensor)->Rank0Tensor:
     46     "1 - `accuracy`"
---> 47     return 1 - accuracy(input, targs)
     48 
     49 

~/anaconda3/envs/dmx1/lib/python3.7/site-packages/fastai/metrics.py in accuracy(input, targs)
     28     input = input.argmax(dim=-1).view(n,-1)
     29     targs = targs.view(n,-1)
---> 30     return (input==targs).float().mean()
     31 
     32 

RuntimeError: Expected object of scalar type Long but got scalar type Float for argument #2 'other'
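
The last frame compares `input.argmax(...)` with `targs`. As a minimal sketch of the dtype mismatch behind this message (the tensors are made-up values, not the data from this issue): `argmax` returns 64-bit integer indices, while labels stored as floats stay floating point. Whether the `==` comparison itself raises depends on the PyTorch version, since newer releases may promote mixed dtypes instead of failing, so the sketch only inspects the dtypes.

```python
import torch

# Hypothetical batch: 3 samples, 2 classes (illustrative values only).
logits = torch.tensor([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]])
# Targets stored as floats, which is what the error message points to.
targs = torch.tensor([1.0, 0.0, 0.0])

preds = logits.argmax(dim=-1)  # class indices, integer (Long) dtype
pred_dtype, targ_dtype = preds.dtype, targs.dtype
print(pred_dtype, targ_dtype)
```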

How I acquired the data, created my Learner:

src = (ImageItemList.from_csv(path, 'stuff.csv', folder='.')
       .random_split_by_pct(0.2)
       .label_from_df(cols=colz))

data = (src.transform(tfms, size=299)
        .databunch().normalize(imagenet_stats)
       )

learn = create_cnn(data, models.resnet18, metrics=error_rate, ps=0.70, loss_func=nn.BCEWithLogitsLoss())

The error appears right after the first epoch, just as the first validation pass starts. Note that I implemented the suggestion provided by S. Gugger in issue #839.

jph00 commented 5 years ago

Based on the code you've provided, this doesn't seem to be an issue with code in this repo (we don't have anything called "stuff.csv"). Please reopen if it is, and provide a link to the code in this repo that is not working correctly.

terribilissimo commented 5 years ago

Hi Jeremy. stuff.csv is just the CSV file with the labels for my dataset. If you don't mind, I'll reopen it, providing a link to the relevant code.

jph00 commented 5 years ago

This repo is for issues in the course materials. You should use the forum for your issue, since it isn't an issue in the course materials.

terribilissimo commented 5 years ago

You are right, I opened this issue here by mistake. I was about to open a new issue on the main fastai repo (not the course repo). Would you prefer that I stick with the forum post I already started and not open an issue? Note, however, that the post is two weeks old and the problem still persists.

I'll wait for your reply before opening a new issue on the fastai repo.

Thanks.

On Sun, Jan 27, 2019 at 1:00 PM Jeremy Howard notifications@github.com wrote:

This repo is for issues in the course materials. You should use the forum for your issue, since it isn't an issue in the course materials.


jph00 commented 5 years ago

Yes you should stick to the forum thread. The answer there seems to be the correct one.

terribilissimo commented 5 years ago

Yes, the answer is correct and it solved my problem; I now define a custom metric before training. But shouldn't we patch the code in fastai? The error occurs when the data is loaded from a CSV file or a DataFrame. When the data is loaded from folders, for example, the error does not appear. I wonder whether you (the fastai team) run into the same issue when you load data for your experiments from a CSV file or a DataFrame.
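
The forum answer is not reproduced in this thread, so the following is only a sketch of what such a custom metric might look like, assuming single-label targets that hold class indices but arrive as floats. The names `accuracy_long` and `error_rate_long` are hypothetical; the body mirrors the `accuracy` code shown in the traceback, with one added cast of the targets to Long.

```python
import torch

def accuracy_long(input, targs):
    "Accuracy that casts the targets to Long before comparing (hypothetical helper)."
    n = targs.shape[0]
    preds = input.argmax(dim=-1).view(n, -1)  # predicted class indices
    targs = targs.view(n, -1).long()          # assumption: targets are class indices stored as floats
    return (preds == targs).float().mean()

def error_rate_long(input, targs):
    "1 - `accuracy_long`"
    return 1 - accuracy_long(input, targs)

# Illustrative values only: predictions are [1, 0, 1], targets are [1, 0, 0].
logits = torch.tensor([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]])
targs = torch.tensor([1.0, 0.0, 0.0])
err = error_rate_long(logits, targs)
```

Under these assumptions it could be passed as `metrics=error_rate_long` in the `create_cnn` call above. If `colz` names several label columns (a multi-label setup, which `nn.BCEWithLogitsLoss` would suggest), an argmax-based metric is probably not the right measure, and a thresholded one such as fastai's `accuracy_thresh` may fit better.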

On Sun, Jan 27, 2019 at 2:51 PM Jeremy Howard notifications@github.com wrote:

Yes you should stick to the forum thread. The answer there seems to be the correct one.


fastai / course-v3

ImageItemList gives wrong type target tensors #141