BCHSI/philter-ucsf
Open source clinical text de-identification
BSD 3-Clause "New" or "Revised" License
fix Philter regex #13
Open
katie-ta opened 2 years ago
katie-ta commented 2 years ago
summary
Changes include:
- a few fixes so that philter.py runs without errors
- fix all letter-matching regular expressions to include accented characters
- use `utf-8` encoding by default
  - ran into issues using the detected encoding on datasets that included accented characters; `utf-8` works just as well
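For reviewers, here is a minimal sketch of the two behavioural changes described above. It is not taken from the Philter codebase: the pattern names, the `read_note` helper, and the sample text are all hypothetical, and the actual patterns touched by this PR may use a different character class. It only illustrates why an ASCII-only letter class splits accented names and how a fixed `utf-8` default replaces a detected encoding.

```python
import re
import tempfile
from pathlib import Path

# ASCII-only letter class: stops at the first accented character.
ASCII_WORD = re.compile(r"[A-Za-z]+")

# Unicode-aware letter class: any word character except digits and underscore.
# For str patterns in Python 3, \w already covers accented letters.
UNICODE_WORD = re.compile(r"[^\W\d_]+")


def read_note(path, encoding="utf-8"):
    """Hypothetical helper: read a note with a fixed encoding
    instead of relying on a detected one."""
    return Path(path).read_text(encoding=encoding)


# Made-up sample text with accented names (written with escapes so the
# characters are unambiguous): "Patient José Muñoz seen by Dr. Böhm"
SAMPLE = "Patient Jos\u00e9 Mu\u00f1oz seen by Dr. B\u00f6hm"

with tempfile.TemporaryDirectory() as tmp:
    note = Path(tmp) / "note.txt"
    note.write_text(SAMPLE, encoding="utf-8")
    text = read_note(note)

ascii_tokens = ASCII_WORD.findall(text)
unicode_tokens = UNICODE_WORD.findall(text)

print(ascii_tokens)    # accented names are broken into fragments
print(unicode_tokens)  # accented names stay whole
```

With the ASCII-only class, a name such as "José" is cut at the accented letter, so a name filter that works on whole tokens could miss it. The Unicode-aware class keeps the token intact.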