suryapa1 closed this 4 years ago
reprocess_input_data controls whether the input text is converted into features from scratch or whether cached features (from a previous run) are loaded from disk. It's not pre-processing in the sense of lowercasing, stemming, etc. (I also wouldn't recommend doing manual stemming with transformers; just use the raw text.)
You can do any preprocessing outside simple transformers. Just do the preprocessing before you use any simple transformers classes/methods.
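As a minimal sketch of cleaning done before anything is handed to simple transformers, the snippet below uses only the standard library. The helper name clean_text and the sample rows are hypothetical, and it deliberately skips stemming, in line with the advice above to keep the text close to raw.

```python
import re


def clean_text(text: str, lowercase: bool = True) -> str:
    """Light, illustrative cleaning; intentionally no stemming."""
    if lowercase:
        text = text.lower()
    # Replace every character that is not a word character
    # (letter, digit, underscore) or whitespace with a space.
    text = re.sub(r"[^\w\s]", " ", text)
    # Collapse the runs of whitespace left behind by the previous step.
    return re.sub(r"\s+", " ", text).strip()


# Hypothetical (text, label) rows standing in for a real dataset.
raw_rows = [("Hello,   WORLD!!", 1), ("Bad -- movie...", 0)]

# Clean first; the cleaned rows are what would later be passed to training.
clean_rows = [(clean_text(text), label) for text, label in raw_rows]
```

The cleaned rows can then be put into whatever structure your training call expects (for example a DataFrame with text and label columns); the library only sees the already-cleaned text.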
If your dataset doesn't change, you don't need to do reprocessing.
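One way to express that choice is a small helper that builds the args dict. This is only a sketch: build_args and dataset_changed are hypothetical names, and the only library option used is reprocess_input_data, the setting discussed above.

```python
def build_args(dataset_changed: bool) -> dict:
    """Return an args dict for a simple transformers model (illustrative)."""
    return {
        # True: rebuild features from the text.
        # False: reuse cached features from a previous run when they exist.
        "reprocess_input_data": dataset_changed,
    }


# Same data as the previous run, so cached features can be reused.
model_args = build_args(dataset_changed=False)

# Assumed usage, requires simpletransformers to be installed:
# from simpletransformers.classification import ClassificationModel
# model = ClassificationModel("roberta", "roberta-base", args=model_args)
```

If you change your own cleaning step, the text the library sees changes too, so that counts as a changed dataset and the features should be rebuilt once.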
Thanks
Describe the bug: Is it possible to mimic reprocess=True outside simple transformers itself?
Any code sample is appreciated!
I want the effect of reprocess=True before feeding data into the model's train method. I want to process the raw data with the usual NLP cleaning steps such as lowercasing, stemming, and special-character handling. What extra steps does enabling reprocess=True perform? I just want to do those steps outside simple transformers itself.
Also, please recommend whether it makes sense to use reprocess=True in general, given that I am already doing the cleaning myself. I also want to save some latency here.