KennethEnevoldsen / augmenty

Augmenty is an augmentation library based on spaCy for augmenting texts.
https://kennethenevoldsen.github.io/augmenty/
MIT License

Resolve spancat dependencies in entity_replacer_v1 #185

Closed SrijithSrinath closed 1 year ago

SrijithSrinath commented 1 year ago

The current function ent_augmenter_v1 resolves all dependencies except the dependency on spancat objects. When the entities in a sentence are changed, please also update the spancat spans stored under keys like "sc", "spans" or a custom key. This feature would allow running augmenty on a dataset that has both entities and spans. For example:

"My name is Srijith Srinath. I work in Spacy."

Ents - PERSON_NAME: Srijith Srinath; WORKPLACE: Spacy

Spans - PERSON_DESC: My name is Srijith Srinath; WORKPLACE_DESC: I work in Spacy

If we change the ent "Srijith Srinath" to "John Doe", the change is applied only to the ents. The spans should be updated as well, as follows:

Ents - PERSON_NAME: John Doe; WORKPLACE: Spacy

Spans - PERSON_DESC: My name is John Doe; WORKPLACE_DESC: I work in Spacy
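
A rough sketch of the requested behaviour, to make the offset bookkeeping concrete. This is not augmenty's implementation: it uses plain token lists and (start, end, label) tuples as stand-ins for spaCy's doc.ents and doc.spans[key], and the names remap_span and replace_entity are hypothetical. It assumes token-level, end-exclusive offsets and that any span touching the replaced entity fully contains it.

```python
from typing import List, Tuple

# (start_token, end_token_exclusive, label): plain stand-ins for spaCy Span objects
Span = Tuple[int, int, str]


def remap_span(span: Span, start: int, end: int, delta: int) -> Span:
    """Adjust one span after tokens[start:end] changed length by `delta` tokens."""
    s, e, label = span
    if e <= start:               # entirely before the replaced entity: unchanged
        return (s, e, label)
    if s >= end:                 # entirely after: shift both boundaries
        return (s + delta, e + delta, label)
    if s <= start and e >= end:  # contains the entity: only the end moves
        return (s, e + delta, label)
    # Partial overlap has no obvious answer; this sketch refuses to guess.
    raise ValueError(f"span {span} partially overlaps the replaced entity")


def replace_entity(
    tokens: List[str],
    ents: List[Span],
    spans: List[Span],
    ent_index: int,
    new_tokens: List[str],
) -> Tuple[List[str], List[Span], List[Span]]:
    """Replace one entity's tokens and keep ents and spans aligned (hypothetical helper)."""
    start, end, _ = ents[ent_index]
    delta = len(new_tokens) - (end - start)
    out_tokens = tokens[:start] + new_tokens + tokens[end:]
    out_ents = [remap_span(x, start, end, delta) for x in ents]
    out_spans = [remap_span(x, start, end, delta) for x in spans]
    return out_tokens, out_ents, out_spans


tokens = ["My", "name", "is", "Srijith", "Srinath", ".", "I", "work", "in", "Spacy", "."]
ents = [(3, 5, "PERSON_NAME"), (9, 10, "WORKPLACE")]
spans = [(0, 5, "PERSON_DESC"), (6, 10, "WORKPLACE_DESC")]

new_tokens, new_ents, new_spans = replace_entity(tokens, ents, spans, 0, ["John", "Doe"])
for s, e, label in new_spans:
    print(label, "-", " ".join(new_tokens[s:e]))
# PERSON_DESC - My name is John Doe
# WORKPLACE_DESC - I work in Spacy
```

When the replacement has a different number of tokens than the original entity, every span that starts after the entity is shifted by the difference, and every span that contains the entity grows or shrinks by the same amount. Partially overlapping spans are rejected here rather than silently clipped; a real implementation would need to pick a policy for them.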

KennethEnevoldsen commented 1 year ago

Hi @SrijithSrinath, sorry for missing this (feel free to remind me next time). This sounds like a very reasonable addition. I will def. see if I can add it in.

SrijithSrinath commented 1 year ago

@KennethEnevoldsen I've actually solved this problem on my own repo. Would it be okay to actually do this PR myself? I will add you as a reviewer.

KennethEnevoldsen commented 1 year ago

@SrijithSrinath I would actually love a PR. I am more than happy to give a review and feedback

KennethEnevoldsen commented 1 year ago

Hi @SrijithSrinath, I have added the functionality in #201. It will be merged once the tests have passed.