Hey there,
So ive been using pysbd to detect boundries in hindi and marathi language and then save the same data rearranged from a paragraph to one sentence boundry per sample.
Unfortunately the storage size has gone down from 22GB to 14.5 GB after just detecting boundries and just saving them per sentence.
and yes i did turn off the clean args.
Hey there, So ive been using pysbd to detect boundries in hindi and marathi language and then save the same data rearranged from a paragraph to one sentence boundry per sample. Unfortunately the storage size has gone down from 22GB to 14.5 GB after just detecting boundries and just saving them per sentence. and yes i did turn off the clean args.