SforAiDl / decepticonlp

Python Library for Robustness Monitoring and Adversarial Debugging of NLP models

MIT License

Text Preprocessing #56

Open someshsingh22 opened 4 years ago

someshsingh22 commented 4 years ago

To implement a common black box, we need text loading, extraction of the words to be attacked, perturbations, distance metrics, and models.
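One way these five pieces might fit together is sketched below. This is only an illustration: every name in it (`load_text`, `select_targets`, `swap_inner`, `token_distance`, `attack`) is hypothetical and not part of decepticonlp, and the "model" is assumed to be any callable that maps a string to a prediction.

```python
from typing import Callable, Dict, List


def load_text(raw: str) -> List[str]:
    """Text loading: split raw text into whitespace tokens."""
    return raw.split()


def select_targets(tokens: List[str], min_len: int = 4) -> List[int]:
    """Word extraction: indices of tokens long enough to perturb."""
    return [i for i, tok in enumerate(tokens) if len(tok) >= min_len]


def swap_inner(word: str) -> str:
    """Perturbation: swap the second and third characters of a word."""
    if len(word) < 4:
        return word
    return word[0] + word[2] + word[1] + word[3:]


def token_distance(a: List[str], b: List[str]) -> int:
    """Distance metric: number of positions whose tokens differ."""
    return sum(x != y for x, y in zip(a, b))


def attack(raw: str, model: Callable[[str], int],
           perturb: Callable[[str], str] = swap_inner) -> Dict[str, object]:
    """Run the hypothetical pipeline and compare model outputs."""
    tokens = load_text(raw)
    perturbed = list(tokens)
    for i in select_targets(tokens):
        perturbed[i] = perturb(tokens[i])
    return {
        "text": " ".join(perturbed),
        "original": model(" ".join(tokens)),
        "perturbed": model(" ".join(perturbed)),
        "distance": token_distance(tokens, perturbed),
    }


def toy_model(text: str) -> int:
    """Stand-in model (assumption): 1 if the word 'great' appears."""
    return int("great" in text.split())


result = attack("this movie is great", toy_model)
```

Keeping each stage as a plain function with a narrow signature would let any of them be replaced independently, for example swapping the whitespace loader for a library tokenizer without touching the perturbation or the metric.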

Text loading needs to be uniform and universal. It should encapsulate all common practices, including embeddings, tokenizers, and batch_loaders, and it should support commonly used tools such as nltk, spacy, and BERT.

We need to think about how we should design this before our first attack.

parantak commented 4 years ago

@someshsingh22, I believe we could start by creating a common class for this inside decepticonlp/preprocess, and then implement a separate class for each of the individual practices. Does that sound good, or do you want to take a different approach?
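A rough sketch of what the common-class idea could look like, assuming an abstract base class with one subclass per practice. The class and method names here (`Preprocessor`, `tokenize`, `batches`, `WhitespacePreprocessor`, `CharPreprocessor`) are made up for illustration and are not an existing decepticonlp API.

```python
from abc import ABC, abstractmethod
from typing import Iterable, Iterator, List


class Preprocessor(ABC):
    """Hypothetical common interface for everything in preprocess/."""

    @abstractmethod
    def tokenize(self, text: str) -> List[str]:
        """Turn one raw string into a list of tokens."""

    def batches(self, texts: Iterable[str],
                batch_size: int = 2) -> Iterator[List[List[str]]]:
        """Yield lists of tokenized texts, batch_size at a time."""
        batch: List[List[str]] = []
        for text in texts:
            batch.append(self.tokenize(text))
            if len(batch) == batch_size:
                yield batch
                batch = []
        if batch:  # emit the final, possibly smaller, batch
            yield batch


class WhitespacePreprocessor(Preprocessor):
    """Word-level practice: split on whitespace."""

    def tokenize(self, text: str) -> List[str]:
        return text.split()


class CharPreprocessor(Preprocessor):
    """Character-level practice: one token per character."""

    def tokenize(self, text: str) -> List[str]:
        return list(text)


word_batches = list(
    WhitespacePreprocessor().batches(["a b", "c", "d e f"], batch_size=2)
)
```

Under this layout, a subclass backed by nltk or spacy would only need to override `tokenize`, and batching would be shared by every practice.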

parantak commented 4 years ago

Refer to #75