woodbri/address-standardizer
An address parser and standardizer in C++
Other
7 stars, 1 fork
issues
Postgresql 11.6: Compiling Error
#43
stamogis
opened
4 years ago
1
Make Alternate Token search optional via postgres wrappers and API
#42
woodbri
opened
5 years ago
0
Notes on branch grammartrie
#41
woodbri
opened
8 years ago
0
Review Tokenizer::splitToken
#40
woodbri
opened
8 years ago
1
Create a template trie class and use it to replace std::map usage
#39
woodbri
opened
8 years ago
0
Create script to load all lexicon and grammar files into DB table
#38
woodbri
closed
8 years ago
1
Rename all lexicon and grammar files based on the iso country code
#37
woodbri
closed
8 years ago
1
Optimize Regex patterns from Lexicon
#36
woodbri
closed
8 years ago
1
Redesign Grammar and Search to improve performance
#35
woodbri
closed
8 years ago
1
Performance Evaluation and Improvements
#34
woodbri
opened
8 years ago
2
Review how PUNCT chars are classified
#33
woodbri
opened
8 years ago
1
Tokenizer does not split off ° symbol
#32
woodbri
closed
8 years ago
1
Notes for Documentation
#31
woodbri
opened
8 years ago
0
Portugal Grammar might need some work
#30
woodbri
opened
8 years ago
0
Parsing 'KM 15,500' needs to be figured out
#29
woodbri
closed
8 years ago
2
These take too long to standardize
#28
woodbri
opened
8 years ago
1
Lexicon and Grammar files with UTF-8 BOM issues
#27
woodbri
opened
8 years ago
0
Problems with German and splitting tokens
#26
woodbri
closed
8 years ago
1
Infinite recursion in grammar check and search
#25
woodbri
closed
8 years ago
1
Add support for intersections
#24
woodbri
opened
8 years ago
0
Tokenizer: Improving word splitting process
#23
woodbri
closed
8 years ago
1
Add Query level caching of standardizer in Postgresql
#22
woodbri
closed
8 years ago
1
The grammar needs to support rules with no output tokens
#21
woodbri
closed
8 years ago
0
The grammar needs to support optional rules
#20
woodbri
opened
8 years ago
0
Add InClass::COMMA so grammars can key on this
#19
woodbri
closed
8 years ago
0
Add the ability to cache the standardizer in the database
#18
woodbri
opened
8 years ago
3
Search class needs to be extended
#17
woodbri
closed
8 years ago
0
Tokenizer treats space as punct token
#16
woodbri
closed
8 years ago
1
Handle joining words split by emdash token
#15
woodbri
closed
8 years ago
1
Look into breaking the lexicon regex in multiple smaller regex
#14
woodbri
closed
8 years ago
2
Convert data from lex, gaz and rules to new formats
#13
woodbri
closed
8 years ago
2
Design and Implement postgresql wrappers
#12
woodbri
closed
8 years ago
3
Change search from a single match to look for all possible matches
#11
woodbri
closed
8 years ago
1
Add method to analyze grammars
#10
woodbri
closed
8 years ago
1
Handle multiple identical adjacent tokens
#9
woodbri
closed
8 years ago
1
Create Unit tests for all classes
#8
woodbri
closed
8 years ago
2
Generate Country Specific Grammar Files
#7
woodbri
closed
8 years ago
1
Add Logging Facility
#6
woodbri
opened
8 years ago
0
Allow regex in Lexicon Entries
#5
woodbri
opened
8 years ago
0
Redesign LexEntry/Token classes to allow them to support a list of classifications
#4
woodbri
closed
8 years ago
1
Redesign Tokenizer to work with Lexicon
#3
woodbri
closed
8 years ago
1
operator <<
#2
cvvergara
closed
8 years ago
1
inclass.*
#1
cvvergara
closed
8 years ago
1