Another common abbreviation for Level #6

jasonrig / address-net

A package to structure Australian addresses

MIT License


As with issue #5, I think this will come down to tuning the address generation code used during training. Your approach of adding L to lookups.py didn't work because the model assigns a high probability to L being a unit number prefix, and a high probability to both 900 and 9 being the unit number itself. The code then concatenates them, since it groups characters of the same class.
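To illustrate the concatenation, here is a minimal sketch of grouping per-character predictions by class. It is not the actual address-net code: the function `group_by_class`, the label names, and the input string are all assumptions made up for this example.

```python
from collections import defaultdict


def group_by_class(text, classes):
    """Collect all characters that share a predicted class into one field.

    Hypothetical helper: `classes` holds one predicted label per character.
    """
    fields = defaultdict(str)
    for char, cls in zip(text, classes):
        if cls != "separator":
            # Characters of the same class are appended to the same field,
            # even when they are not adjacent in the input.
            fields[cls] += char
    return dict(fields)


# Hypothetical predictions for the input "L9 900": the model treats both
# "9" and "900" as the unit number, so they end up in a single field.
text = "L9 900"
classes = [
    "flat_number_prefix",
    "flat_number",
    "separator",
    "flat_number",
    "flat_number",
    "flat_number",
]
result = group_by_class(text, classes)
```

With these assumed predictions the two numbers merge into one unit number, which is why changing the lookup alone would not help: the per-character predictions themselves need to change.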

Perhaps issues #5 and #6 can be worked on together, since both would involve tuning the address synthesis code.

If you want to experiment and see whether anything works for you, I would start by working through the synthesise_address function. It's a little ugly, but you can see that, for a given clean record from GNAF, it mutates the string in various ways while keeping track of the class of each character (unit type, street name, etc.). The goal would be to make these random mutations produce more examples of the cases you're seeing fail.
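The general idea can be sketched as follows. This is only an illustration under stated assumptions, not the real synthesise_address: the function `synthesise_example`, the `LEVEL_ABBREVIATIONS` list, the label names, and the sample record are all hypothetical.

```python
import random

# Hypothetical spellings of "level"; the real lookups in address-net may differ.
LEVEL_ABBREVIATIONS = ["LEVEL", "LVL", "L"]


def synthesise_example(components, rng):
    """Build a training string plus one class label per character.

    `components` is a list of (text, label) pairs taken from a clean record.
    """
    chars, labels = [], []
    previous = None
    for text, label in components:
        if label == "level_type":
            # Random mutation: swap in one of several spellings of "level".
            text = rng.choice(LEVEL_ABBREVIATIONS)
        # Random mutation: sometimes drop the space between the level type
        # and the level number, producing strings such as "L9".
        joined = (
            previous == "level_type"
            and label == "level_number"
            and rng.random() < 0.5
        )
        if previous is not None and not joined:
            chars.append(" ")
            labels.append("separator")
        chars.extend(text)
        # Every character keeps the label of the component it came from.
        labels.extend([label] * len(text))
        previous = label
    return "".join(chars), labels


# Made-up record for illustration only.
record = [
    ("LEVEL", "level_type"),
    ("9", "level_number"),
    ("900", "number_first"),
    ("EXAMPLE", "street_name"),
    ("STREET", "street_type"),
]
rng = random.Random(0)
samples = [synthesise_example(record, rng) for _ in range(5)]
```

Each sample pairs a mutated string with labels of the same length, so rarer spellings such as "L9 900 EXAMPLE STREET" can appear in the training data with the correct class for every character.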
