dtraskas opened 6 years ago (status: Open)
Expressions like "Flat 12" are usually not part of building-level data sets like OSM. However, since it was important for the parser to handle sub-building information, for 1.0 we randomly generated a variety of those expressions per language and per country (including some Scotland-specific patterns like "TR" for Top Right) and appended them to the base address. That said, there are probably still a number of real-world patterns missing from our data.
If there's a specific pattern it's not handling correctly, we can just generate the pattern. Can you provide a few examples of what's not working?
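To make the idea concrete, here is a rough sketch of that kind of augmentation: pick a random sub-building phrase and attach it to a base address. This is not the actual training-data code. The function names, the pattern lists, the `unit`/`base` labels, and the example address are all assumptions made up for illustration; only "Flat 12" and "TR" come from the description above.

```python
import random

# Hypothetical pattern lists for illustration only; the real generator
# draws on per-language/per-country dictionaries that are not shown here.
UNIT_TYPES = ["Flat", "Apartment", "Unit"]
# "TR" (Top Right) is mentioned above; "TL" is assumed by analogy.
SCOTTISH_POSITIONS = ["TR", "TL"]


def random_sub_building(rng, scottish=False):
    """Return a random sub-building phrase such as 'Flat 12' or 'TR'."""
    if scottish and rng.random() < 0.5:
        return rng.choice(SCOTTISH_POSITIONS)
    return f"{rng.choice(UNIT_TYPES)} {rng.randint(1, 99)}"


def augment(base_address, rng, scottish=False):
    """Prepend a generated phrase to a base address.

    Returns the augmented text plus (text, label) pairs that a sequence
    model could train on. The label names here are illustrative.
    """
    unit = random_sub_building(rng, scottish)
    text = f"{unit}, {base_address}"
    labels = [(unit, "unit"), (base_address, "base")]
    return text, labels


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so runs are repeatable
    for _ in range(3):
        # Made-up example address, not taken from any data set.
        print(augment("10 High Street, Glasgow", rng, scottish=True)[0])
```

A missing real-world pattern would be covered by adding it to lists like these, which is why a few concrete failing examples are the most useful thing to report.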
Hi,
I tried your awesome library for parsing UK addresses and have had good accuracy so far. However, there are still problems with Scottish addresses, especially flats. I noticed that you do not recommend training the models on our own data, and instead ask users to contribute datasets so that you can potentially do that yourselves. Correct me if I am wrong, but I am keen to improve the parser with more UK-based data.