A project with P-ai, where we create a resume helper that automatically scrapes data from job postings and helps make sure you put your best foot forward.
Remove Unicode characters and add spaces in the data we tag #26
Unicode characters (e.g., \u87) sometimes show up in the data we tag, which throws off the model. Additionally, words are sometimes joined by punctuation ("word.word"), so we also need to handle that by replacing the punctuation with spaces or by some other method.
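One possible starting point, as a rough sketch only: the function name `clean_text`, the regular expressions, and the choice to replace (rather than delete) unwanted characters are all assumptions for illustration and are not taken from the project's code. It strips literal escape sequences such as `\u87` that arrive as plain text, folds accented letters to ASCII, replaces any other non-ASCII character with a space, and splits words joined by punctuation.

```python
import re
import unicodedata

# Hypothetical patterns; adjust them to whatever the tagged data really contains.
# A literal backslash-u escape left in the text as plain characters, e.g. "\u87".
_LITERAL_ESCAPE = re.compile(r"\\u[0-9a-fA-F]{2,4}")
# Punctuation sitting directly between two letters, e.g. "word.word".
_JOINED_PUNCT = re.compile(r"(?<=[A-Za-z])[.,;:!?/|]+(?=[A-Za-z])")
_WHITESPACE = re.compile(r"\s+")


def clean_text(text: str) -> str:
    """Return an ASCII-only version of text with joined words separated."""
    # 1. Replace literal escape sequences with a space.
    text = _LITERAL_ESCAPE.sub(" ", text)

    # 2. Decompose characters so "é" becomes "e" plus a combining accent.
    text = unicodedata.normalize("NFKD", text)

    # 3. Drop combining marks, keep ASCII, turn everything else into a space.
    kept = []
    for ch in text:
        if unicodedata.combining(ch):
            continue
        kept.append(ch if ord(ch) < 128 else " ")
    text = "".join(kept)

    # 4. Separate words that are joined by punctuation.
    text = _JOINED_PUNCT.sub(" ", text)

    # 5. Collapse the runs of whitespace created by the steps above.
    return _WHITESPACE.sub(" ", text).strip()


if __name__ == "__main__":
    print(clean_text("word.word"))
    print(clean_text("data \\u87 science"))
    print(clean_text("r\u00e9sum\u00e9 skills\u2022python"))
```

A few caveats with this approach: the punctuation rule also splits abbreviations and URLs ("e.g." becomes "e g."), and the escape pattern can swallow hex-looking letters that directly follow an escape, so both would likely need tuning against real postings before being used on the tagged data.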
Hey @Kayala47,
I would like to know the status of this issue.
If it's still open, I would like to take it on as my first contribution.
Could you also point me to where I should start?
Regards