intel / dffml

The easiest way to use Machine Learning. Mix and match underlying ML libraries and data set sources. Generate new datasets or modify existing ones with ease.
https://intel.github.io/dffml/main/
MIT License

docs: nlp: Trailing whitespaces and broken plugin docs #818

Closed by sakshamarora1 3 years ago

sakshamarora1 commented 4 years ago
docs/tutorials/dataflows/nlp.rst:20: trailing whitespace.
+These words are called `StopWords`. 
examples/dataflow/chatbot/configs.ini:6: new blank line at EOF.
model/spacy/dffml_model_spacy/ner/ner_model.py:158: trailing whitespace.
+    
model/spacy/dffml_model_spacy/ner/ner_model.py:226: trailing whitespace.
+    The location of the function is passed using: 
model/spacy/examples/ner/accuracy.sh:9: trailing whitespace.
+  -log debug 
model/spacy/examples/ner/predict.sh:9: trailing whitespace.
+  -log debug 
model/spacy/examples/ner/train.sh:9: trailing whitespace.
+  -log debug 
operations/nlp/dffml_operations_nlp/operations.py:135: trailing whitespace.
+    
operations/nlp/dffml_operations_nlp/operations.py:188: trailing whitespace.
+        A spacy model with the capability of parsing. Sentence 
operations/nlp/dffml_operations_nlp/operations.py:223: trailing whitespace.
+    Converts a collection of text documents to a matrix of token counts using sklearn CountVectorizer's `fit_transform` method. 

Screenshot from 2020-07-28 22-17-40

johnandersen777 commented 4 years ago

The main issue here is that scripts/docs.py emits a ----- underline somewhere, so the Sphinx RST parser treats that ----- as a second-level header underline.
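
To illustrate why a stray ----- turns into a header: in reStructuredText, a text line followed by a run of one punctuation character at least as long as the text is read as a section title, and header levels are assigned in the order the underline characters are first seen. The sketch below is a simplified approximation of that rule, not the docutils implementation. It ignores overlined titles and too-short underlines, and the names is_adornment and heading_levels are made up for this example.

```python
import string


def is_adornment(line):
    """True if the line is a run of one repeated punctuation character."""
    s = line.rstrip()
    return bool(s) and s[0] in string.punctuation and set(s) == {s[0]}


def heading_levels(text):
    """Return (title, level) pairs for underline-style section titles.

    Simplified assumption: levels follow the order in which underline
    characters first appear, which is how RST ranks section titles.
    """
    lines = text.splitlines()
    styles = []    # underline characters in order of first appearance
    headings = []
    for title, underline in zip(lines, lines[1:]):
        # Skip blank lines and lines that are themselves underlines.
        if not title.strip() or is_adornment(title):
            continue
        if is_adornment(underline) and len(underline.rstrip()) >= len(title.rstrip()):
            char = underline[0]
            if char not in styles:
                styles.append(char)
            headings.append((title.strip(), styles.index(char) + 1))
    return headings


doc = "Title\n=====\n\nArgs\n-----\n\nbody text\n"
print(heading_levels(doc))
```

Here the = underline is seen first, so any line that ends up directly above a ----- becomes a second-level header, whether or not that was intended.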

johnandersen777 commented 4 years ago

Also that whitespace checker is broken: #232

We fixed - in the last commit.

johnandersen777 commented 4 years ago

What are you using to check for trailing whitespace?

sakshamarora1 commented 4 years ago

@pdxjohnny I have a pre-commit git hook that runs this when I do git merge master or git commit:

exec git diff-index --check --cached $against --

It was already there in dffml/.git/hooks/pre-commit

It can also be reproduced by diffing against the commit just before the NLP operations commit:

git diff c3e6fb259650c7b17278f15749792429964e30c6 --check
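
For anyone without that hook, the two kinds of error reported above (trailing whitespace and a blank line at EOF) can be approximated in a few lines of Python. This is only a rough sketch: it scans a whole string rather than the added lines of a diff, it does not cover git's other whitespace checks, and find_whitespace_errors is a hypothetical helper name, not part of git or dffml.

```python
def find_whitespace_errors(text):
    """Return (line_number, message) pairs for two common whitespace errors.

    The messages mirror the wording in the report above, but the logic is
    an approximation and not git's own implementation.
    """
    errors = []
    lines = text.split("\n")
    # A file that ends with a newline yields one final empty string; drop it.
    if lines and lines[-1] == "":
        lines.pop()
    for lineno, line in enumerate(lines, start=1):
        # Flag spaces or tabs left at the end of a line.
        if line != line.rstrip(" \t"):
            errors.append((lineno, "trailing whitespace."))
    # Flag a blank last line, i.e. an extra empty line at the end of the file.
    if lines and lines[-1].strip() == "":
        errors.append((len(lines), "new blank line at EOF."))
    return errors


sample = "These words are called `StopWords`. \nNext line\n"
for lineno, message in find_whitespace_errors(sample):
    print(f"{lineno}: {message}")
```

Running it over the contents of each changed file gives a quick local check, though git diff --check remains the reference for what the hook actually flags.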