marcotcr / checklist

Beyond Accuracy: Behavioral Testing of NLP models with CheckList
MIT License
2.01k stars, 204 forks

Multilingual perturbations? #45

Closed: kevinrobinson closed this issue 4 years ago

kevinrobinson commented 4 years ago

Thanks for your work! perturb.py is English-only. Are there any plans for the API to support other languages?

marcotcr commented 4 years ago

The perturb function is general, but it's the only one that is. I don't plan on adding functions for other languages, unless you have a suggestion for functions that are general and would work for any language. I guess I could try writing change_names and change_location so that they work with other languages without NER, but the results would not be as good. Is that what you had in mind?
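For readers wondering what a NER-free change_names might look like, here is a minimal sketch, not the library's implementation. It assumes the caller supplies a list of names for the target language; `swap_known_names` and its parameters are hypothetical. Because it matches whole words from a list rather than tagged entities, it will also replace tokens that merely look like names, and it relies on word boundaries, so it would not work as-is for languages written without spaces.

```python
import random
import re


def swap_known_names(text, known_names, n=3, seed=None):
    """Return up to n variants of text with a listed name swapped for another.

    Hypothetical helper, not part of checklist. Names are matched by
    whole-word lookup against known_names rather than by NER.
    """
    rng = random.Random(seed)
    patterns = {
        name: re.compile(r"\b" + re.escape(name) + r"\b") for name in known_names
    }
    # Names from the list that actually occur in the text.
    present = [name for name in known_names if patterns[name].search(text)]
    variants = []
    for name in present:
        # Only substitute names that do not already occur in the text.
        candidates = [c for c in known_names if c not in present]
        rng.shuffle(candidates)
        for replacement in candidates:
            variants.append(patterns[name].sub(lambda _m, r=replacement: r, text))
    return variants[:n]


if __name__ == "__main__":
    names = ["María", "Lucía", "Sofía"]
    for variant in swap_known_names("María vive en Lisboa.", names, n=2, seed=0):
        print(variant)
```

This trades the precision of NER for portability, which matches the caveat above that such a version "would not be as good".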

kevinrobinson commented 4 years ago

@marcotcr thanks! I was looking at the methods that use spaCy annotations and figured they'd work in multilingual settings. Methods like add_negation and expand_contractions are English-only, but similar implementations could be written to build out support for other languages.
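As a rough illustration of the "similar implementations" idea, a contraction expander could be driven by a per-language lookup table. This is only a sketch under stated assumptions: `CONTRACTIONS` and `expand_contractions_for` are hypothetical names, the tables are deliberately tiny, and none of this is taken from checklist.

```python
import re

# Hypothetical, deliberately tiny tables; not taken from checklist.
CONTRACTIONS = {
    "de": {"zum": "zu dem", "im": "in dem", "vom": "von dem", "ins": "in das"},
    "en": {"can't": "cannot", "isn't": "is not", "won't": "will not"},
}


def expand_contractions_for(text, lang):
    """Expand contractions using the table for lang; return text unchanged
    when no table exists for that language."""
    table = CONTRACTIONS.get(lang)
    if not table:
        return text
    # Longest keys first so a longer contraction wins over a shorter prefix.
    alternatives = "|".join(
        re.escape(key) for key in sorted(table, key=len, reverse=True)
    )
    pattern = re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)

    def expand(match):
        word = match.group(0)
        expanded = table[word.lower()]
        # Preserve sentence-initial capitalization.
        if word[0].isupper():
            expanded = expanded[0].upper() + expanded[1:]
        return expanded

    return pattern.sub(expand, text)


if __name__ == "__main__":
    print(expand_contractions_for("Ich gehe zum Bahnhof.", "de"))
    print(expand_contractions_for("I can't go.", "en"))
```

A table-driven design keeps the function itself language-neutral, but each table still needs review by someone who knows the language, since an expanded form can shift emphasis or sound unnatural.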

I'm mostly asking about what the scope of the project is and if you're continuing to develop it in any of those directions, so it's helpful to know that this isn't something you'll work on. Thanks for sharing this awesome work 👍

marcotcr commented 4 years ago

I am trying to develop it in other directions that I think are more urgent :)

kevinrobinson commented 4 years ago

@marcotcr Yeah that's cool, I don't mean any judgment by asking :) I'd be excited to read more about what directions you are working on when you're ready to share! 👍

marcotcr commented 4 years ago

I know, I didn't take your questions negatively at all :)