Open abaddon-moriarty opened 4 months ago
Found what was causing the empty texts: the `beginning` variable becomes -1 when the keyword is not found, so that would select the entire text.
Instead of using `if beginning:` (which is also true for -1), I used `if beginning > 1:`, which seems to do the trick.
Will be updated in the next commit.
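For reference, here is a minimal sketch of why the truthiness check misfires. It assumes `beginning` comes from `str.find` (the issue does not say which call produces it); the sample strings and the `keyword_found` helper are hypothetical. One thing it highlights: a check like `beginning > 1` would also reject a keyword found at index 0 or 1, so comparing against -1 may be the safer test.

```python
# Assumption: `beginning` is the result of str.find, which returns -1 on a miss.
missing = "some header and body".find("KEYWORD")   # keyword absent
at_start = "KEYWORD then body".find("KEYWORD")     # keyword at index 0

# -1 is truthy and 0 is falsy, so `if beginning:` gets both cases backwards.
assert missing == -1 and bool(missing) is True
assert at_start == 0 and bool(at_start) is False


def keyword_found(beginning: int) -> bool:
    """Hypothetical helper: True only when str.find located the keyword."""
    return beginning != -1
```

With this, `if keyword_found(beginning):` accepts a match at any position, including the very start of the text.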
If I separate the preprocessing from the main script, we can simply run the preprocessing once before everything else and output the cleaned texts to a separate folder. Then we won't have to redo it every time we want to use main.py.
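A rough sketch of what that standalone preprocessing step could look like. Everything here is an assumption for illustration: the `clean_text` placeholder, the `preprocess_folder` name, and the `*.txt` file layout are not taken from the repository.

```python
from pathlib import Path


def clean_text(raw: str) -> str:
    """Placeholder for the project's real cleaning logic (hypothetical)."""
    return raw.strip()


def preprocess_folder(raw_dir: Path, clean_dir: Path) -> int:
    """Clean every .txt file in raw_dir, write it to clean_dir, return the count."""
    clean_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    for src in sorted(raw_dir.glob("*.txt")):
        cleaned = clean_text(src.read_text(encoding="utf-8"))
        # Keep the original file name so main.py can look texts up unchanged.
        (clean_dir / src.name).write_text(cleaned, encoding="utf-8")
        count += 1
    return count
```

Running something like `preprocess_folder(Path("raw_texts"), Path("cleaned_texts"))` once (both folder names are made up) would let main.py read directly from the cleaned folder on later runs.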