One sentence inside another

JuanFF commented 4 years ago

Hello,

I'm going to expand a custom training set with new examples. These two sentences are in the set:

Good morning I need help to fix this issue
I need help to fix this issue

In this case, I need DeepSegment to keep the boundaries of the longer one (example 1, the first sentence above). Since both examples are in the training set, I wonder whether the final result would be

['Good morning', 'I need help to fix this issue']

I would like to avoid this split but keep both examples in the training set. Would this be possible after training the model?

Thanks

bedapudi6788 commented 4 years ago

For this, I would keep both examples in the training set and let the model learn that "Good morning" (or contextually similar phrases) should not be split off when followed by phrases like "I need help to fix this issue".

So, keep both example 1 and example 2 in the training set and train the model. Based on your results (i.e., if it still splits sentences like example 1), you might need to add more data similar to example 1.
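
One way to check the results after training is to run the sentences that must stay whole through the segmenter and list the ones that come back in more than one piece. The sketch below is only an illustration: `find_oversplit` and `toy_segment` are hypothetical names, and `toy_segment` is a stand-in that mimics the unwanted split. It assumes the trained model can be wrapped as a callable that takes a string and returns a list of sentences.

```python
from typing import Callable, List


def find_oversplit(sentences: List[str],
                   segment: Callable[[str], List[str]]) -> List[str]:
    """Return the sentences that the segmenter breaks into more than one piece."""
    return [s for s in sentences if len(segment(s)) > 1]


def toy_segment(text: str) -> List[str]:
    """Stand-in segmenter (not DeepSegment): splits off a leading greeting."""
    greeting = "Good morning "
    if text.startswith(greeting):
        return [greeting.strip(), text[len(greeting):]]
    return [text]


# Sentences that should each be returned as a single segment.
held_out = [
    "Good morning I need help to fix this issue",
    "I need help to fix this issue",
]

# Replace toy_segment with the trained model's segmenting function.
oversplit = find_oversplit(held_out, toy_segment)
print(oversplit)
```

If the list is not empty after training, that is the signal to add more training examples similar to the ones it contains.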

JuanFF commented 4 years ago

Thanks a lot!

bedapudi6788 commented 4 years ago

@JuanFF I am closing the issue for now. Feel free to re-open if required.

notAI-tech / deepsegment

One sentence inside another #36