joelansbro / pipeline

API Pipeline DB middleware

Cleanjobs.py apostrophe requirement #15

Open jagithub2 opened 2 years ago

jagithub2 commented 2 years ago

Cleanjob will need to do some preprocessing so that content can be inserted correctly into the database.

We must either come up with a preprocessing method for handling apostrophes within content (left unaccounted for, they cause an error within the SQLite transaction) or find a way of binding the content so that the apostrophes are retained.

E.g.: content = 'it's me!' <- will error out at the s, because the apostrophe ends the string literal early

We should ideally be able to preserve the apostrophes, as they could be valuable to research.
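
One way the "binding" option could work is with the `?` placeholders in Python's built-in `sqlite3` module, which pass the content as a bound value instead of pasting it into the SQL text. This is only a minimal sketch: the `articles` table, its `content` column, and the `insert_content` helper are made-up names for illustration, not the pipeline's actual schema or code.

```python
import sqlite3


def insert_content(conn, content):
    # Hypothetical table/column names; the "?" placeholder lets sqlite3
    # bind the value, so apostrophes never reach the SQL parser as quotes.
    conn.execute("INSERT INTO articles (content) VALUES (?)", (content,))
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, content TEXT)")

# Building the statement by hand reproduces the failure described above.
try:
    conn.execute("INSERT INTO articles (content) VALUES ('it's me!')")
    naive_failed = False
except sqlite3.Error:
    naive_failed = True

# The bound version stores the text unchanged, apostrophe included.
insert_content(conn, "it's me!")
stored = conn.execute("SELECT content FROM articles").fetchone()[0]
```

With this approach no character replacement is needed, so the apostrophes stay available for later research.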

jagithub2 commented 2 years ago

This has been fixed by replacing the apostrophe ' with a backtick `, which allows us to insert into the SQLite database. I'm keeping this open as a sticky note in case something crops up in the future, for example on the NLP side of things, that requires us to revert the backtick back to a normal apostrophe.
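
For reference, the replacement and a possible revert could look roughly like the sketch below. The function names `escape_apostrophes` and `restore_apostrophes` are hypothetical and not taken from Cleanjobs.py. Note the assumption baked into the revert: it is only lossless if the original content never contains a backtick of its own.

```python
def escape_apostrophes(text):
    # Swap every apostrophe for a backtick before inserting into SQLite.
    return text.replace("'", "`")


def restore_apostrophes(text):
    # Reverse the swap, e.g. before handing text to an NLP step.
    # Assumption: any backtick in the stored text was originally an apostrophe.
    return text.replace("`", "'")


original = "it's me!"
escaped = escape_apostrophes(original)
restored = restore_apostrophes(escaped)

# Content that already had backticks does not survive the round trip.
lossy = restore_apostrophes(escape_apostrophes("run `ls` now"))
```

If content with genuine backticks (code snippets, for instance) ever shows up, the revert would turn those into apostrophes as well, which is one reason this might need revisiting.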