Closed Navaneethsen closed 3 years ago
Hi, the splitter will generally make some mistakes because it is a statistical model. For this specific text, you can fix the mistake by increasing the threshold (default is 0.8):
In [1]: from nnsplit import NNSplit
In [2]: splitter = NNSplit.load("en", threshold=0.99)
In [3]: splits = splitter.split(["What's working and what needs to change? Not everybody Dr.Jones, has the opportunity to watch themselves after they've had a date to see what they're doing right or wrong, so that you will only know what to do in the next day. Yeah, but it's such an important exercise that they needed to do. Last week they went on their first date, which is a huge step for our single wives, and a great time for us to watch your dates.."])[0]
In [4]: [str(x) for x in splits]
Out[4]:
["What's working and what needs to change? ",
"Not everybody Dr.Jones, has the opportunity to watch themselves after they've had a date to see what they're doing right or wrong, so that you will only know what to do in the next day. ",
"Yeah, but it's such an important exercise that they needed to do. ",
'Last week they went on their first date, which is a huge step for our single wives, and a great time for us to watch your dates..']
Note, however, that a higher threshold will miss some other splits, especially where punctuation is missing.
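To illustrate the trade-off, here is a minimal sketch of thresholding in general. It is not nnsplit's actual implementation: the function `split_at_threshold`, the sample text, and the boundary probabilities are all made up for illustration.

```python
def split_at_threshold(text, boundary_probs, threshold=0.8):
    """Split `text` after every index whose probability exceeds `threshold`.

    `boundary_probs` maps a character index to a hypothetical probability
    that a sentence ends right after that character.
    """
    pieces, start = [], 0
    for index in sorted(boundary_probs):
        if boundary_probs[index] > threshold:
            pieces.append(text[start:index + 1])
            start = index + 1
    if start < len(text):
        pieces.append(text[start:])
    return pieces


text = "Ask Dr.Jones today. It went well see you soon"

# Made-up probabilities, chosen only to show the effect of the threshold.
boundary_probs = {
    text.index("Dr.") + 2: 0.90,       # spurious boundary after "Dr."
    text.index("today. ") + 6: 0.999,  # real boundary with punctuation
    text.index("well ") + 4: 0.85,     # real boundary without punctuation
}

low = split_at_threshold(text, boundary_probs, threshold=0.8)
high = split_at_threshold(text, boundary_probs, threshold=0.99)
print(low)
print(high)
```

With the lower threshold, the sketch splits wrongly after "Dr." but also catches the unpunctuated boundary after "well". With the higher threshold, the wrong split disappears, and so does the unpunctuated one.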
That said, I am not entirely satisfied with the quality of the current model, so I'll try some things to improve it.
Hi,
My sentence is as shown below:
What's working and what needs to change? Not everybody Dr.Jones, has the opportunity to watch themselves after they've had a date to see what they're doing right or wrong, so that you will only know what to do in the next day. Yeah, but it's such an important exercise that they needed to do. Last week they went on their first date, which is a huge step for our single wives, and a great time for us to watch your dates..
When I split it using nnsplit, the resulting sentences are shown below:
I don't think this is right. Could you please let me know if these splits can be improved?