GRAAL-Research / deepparse

Deepparse is a state-of-the-art library for parsing multinational street addresses using deep learning
https://deepparse.org/
GNU Lesser General Public License v3.0

[BUG] Error while retraining #114

Closed gabryarri closed 2 years ago

gabryarri commented 2 years ago

I tried to retrain the model using new tags but this happened:

line 16, in nll_loss
    loss += criterion(pred[i], ground_truth[i])
IndexError: index 12 is out of bounds for dimension 0 with size 12

davebulaval commented 2 years ago

That looks like an empty tags list for some addresses.

Try installing the dev version with the following command:

pip install -U git+https://github.com/GRAAL-Research/deepparse.git@dev

I've added some data tests (e.g. checking that no address has an empty tags list) and improved error handling for such cases.
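For readers who want to check their own training data before retraining, the kind of data test mentioned above can be sketched as follows. This is not deepparse's own validation code. The function name `find_invalid_examples` is hypothetical, and the sketch assumes the training data is a list of `(address, tags)` tuples with one tag per whitespace-separated token.

```python
from typing import List, Optional, Sequence, Tuple

# Assumed data layout: one (address, tags) pair per training example.
Example = Tuple[Optional[str], Sequence[str]]


def find_invalid_examples(data: Sequence[Example]) -> List[Tuple[int, str]]:
    """Return (index, reason) for every example that looks malformed.

    Hypothetical helper, not part of the deepparse API.
    """
    problems = []
    for index, (address, tags) in enumerate(data):
        if address is None or not address.strip():
            problems.append((index, "missing or empty address"))
        elif not tags:
            problems.append((index, "empty tags list"))
        elif len(address.split()) != len(tags):
            # Assumes one tag per whitespace-separated token.
            problems.append((index, "token/tag count mismatch"))
    return problems


if __name__ == "__main__":
    sample = [
        ("350 rue des Lilas", ["StreetNumber", "StreetName", "StreetName", "StreetName"]),
        ("123 Main St", []),
    ]
    for index, reason in find_invalid_examples(sample):
        print(f"example {index}: {reason}")
```

Running a check like this before retraining points at the offending rows directly, instead of surfacing later as an IndexError inside the loss function.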

davebulaval commented 2 years ago

See the new release for better error handling.

gabryarri commented 2 years ago

I have successfully retrained the model, but when I try to use it to make new predictions, the program gives me this error:

AttributeError: 'NoneType' object has no attribute 'lower'

This is the code I wrote:

address_parser = AddressParser(path_to_retrained_model=my_path)

I also tried specifying the model type, but nothing changed.

davebulaval commented 2 years ago

It seems like some addresses in your dataset are None. We apply a cleaning process to the addresses (comma removal and lowercasing), and calling lower() on a None value raises exactly this AttributeError.

I will improve the dataset validation to cover this case as well.
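Until such validation is available, one possible workaround is to filter out missing values before handing addresses to the parser. The sketch below is only an illustration: `split_valid_addresses` is a hypothetical helper, not a deepparse function, and it assumes the addresses are held in a plain Python sequence.

```python
from typing import List, Optional, Sequence, Tuple


def split_valid_addresses(
    addresses: Sequence[Optional[str]],
) -> Tuple[List[str], List[int]]:
    """Separate usable addresses from None or blank entries.

    Hypothetical helper, not part of the deepparse API.
    Returns the usable addresses and the indices of the skipped entries.
    """
    valid = []
    skipped = []
    for index, address in enumerate(addresses):
        if isinstance(address, str) and address.strip():
            valid.append(address)
        else:
            # None (or a blank string) would fail or be meaningless
            # once the cleaning step calls .lower() on it.
            skipped.append(index)
    return valid, skipped


if __name__ == "__main__":
    raw = ["350 rue des Lilas Ouest Quebec", None, "   ", "1 Main St"]
    cleaned, skipped = split_valid_addresses(raw)
    print(f"kept {len(cleaned)} addresses, skipped indices {skipped}")
    # The cleaned list can then be passed to the parser, for example
    # address_parser(cleaned), assuming address_parser was created as shown earlier.
```

Keeping the skipped indices makes it easier to trace the missing values back to their rows in the original dataset.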