PyThaiNLP / pythainlp

Thai natural language processing in Python
https://pythainlp.org/
Apache License 2.0

Repo's code format is inconsistent with the contribution guide #744

Closed new5558 closed 2 years ago

new5558 commented 2 years ago

Description

The project's CONTRIBUTING.md states: "Follows PEP8, use black with --line-length = 79". However, when I tried to contribute and followed this instruction, I found that the repository's existing code does not conform to it: running black reformats many files.

Expected results

On branch dev
Your branch is up to date with 'origin/dev'.

nothing to commit, working tree clean

Current results

On branch dev
Your branch is up to date with 'origin/dev'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        modified:   pythainlp/augment/lm/fasttext.py
        modified:   pythainlp/augment/lm/wangchanberta.py
        modified:   pythainlp/augment/word2vec/__init__.py
        modified:   pythainlp/augment/word2vec/bpemb_wv.py
        modified:   pythainlp/augment/word2vec/core.py
        modified:   pythainlp/augment/word2vec/ltw2v.py
        modified:   pythainlp/augment/word2vec/thai2fit.py
        modified:   pythainlp/augment/wordnet.py
        modified:   pythainlp/benchmarks/word_tokenization.py
        modified:   pythainlp/cli/data.py
        modified:   pythainlp/cli/soundex.py
        modified:   pythainlp/cli/tag.py
        modified:   pythainlp/cli/tokenize.py
        modified:   pythainlp/corpus/__init__.py
        modified:   pythainlp/corpus/core.py
        modified:   pythainlp/corpus/oscar.py
        modified:   pythainlp/corpus/tnc.py
        modified:   pythainlp/corpus/ttc.py
        modified:   pythainlp/corpus/wordnet.py
        modified:   pythainlp/generate/__init__.py
        modified:   pythainlp/generate/core.py
        modified:   pythainlp/generate/thai2fit.py
        modified:   pythainlp/parse/__init__.py
        modified:   pythainlp/parse/core.py
        modified:   pythainlp/parse/esupar_engine.py
        modified:   pythainlp/parse/spacy_thai_engine.py
        modified:   pythainlp/parse/transformers_ud.py
        modified:   pythainlp/soundex/core.py
        modified:   pythainlp/soundex/prayut_and_somchaip.py
        modified:   pythainlp/spell/__init__.py
        modified:   pythainlp/spell/core.py
        modified:   pythainlp/spell/phunspell.py
        modified:   pythainlp/spell/symspellpy.py
        modified:   pythainlp/summarize/core.py
        modified:   pythainlp/summarize/freq.py
        modified:   pythainlp/summarize/mt5.py
        modified:   pythainlp/tag/_tag_perceptron.py
        modified:   pythainlp/tag/chunk.py
        modified:   pythainlp/tag/crfchunk.py
        modified:   pythainlp/tag/named_entity.py
        modified:   pythainlp/tag/pos_tag.py
        modified:   pythainlp/tag/thai_nner.py
        modified:   pythainlp/tag/thainer.py
        modified:   pythainlp/tag/tltk.py
        modified:   pythainlp/tag/unigram.py
        modified:   pythainlp/tag/wangchanberta_onnx.py
        modified:   pythainlp/tokenize/__init__.py
        modified:   pythainlp/tokenize/core.py
        modified:   pythainlp/tokenize/nercut.py
        modified:   pythainlp/tokenize/nlpo3.py
        modified:   pythainlp/tokenize/oskut.py
        modified:   pythainlp/tokenize/sefr_cut.py
        modified:   pythainlp/tokenize/tcc.py
        modified:   pythainlp/tokenize/tcc_p.py
        modified:   pythainlp/tokenize/thaisumcut.py
        modified:   pythainlp/tokenize/tltk.py
        modified:   pythainlp/tools/misspell.py
        modified:   pythainlp/translate/__init__.py
        modified:   pythainlp/translate/core.py
        modified:   pythainlp/translate/en_th.py
        modified:   pythainlp/translate/th_fr.py
        modified:   pythainlp/translate/zh_th.py
        modified:   pythainlp/transliterate/__init__.py
        modified:   pythainlp/transliterate/iso_11940.py
        modified:   pythainlp/transliterate/spoonerism.py
        modified:   pythainlp/transliterate/thai2rom.py
        modified:   pythainlp/transliterate/thaig2p.py
        modified:   pythainlp/transliterate/tltk.py
        modified:   pythainlp/transliterate/w2p.py
        modified:   pythainlp/ulmfit/__init__.py
        modified:   pythainlp/ulmfit/preprocess.py
        modified:   pythainlp/util/keyboard.py
        modified:   pythainlp/util/normalize.py
        modified:   pythainlp/util/strftime.py
        modified:   pythainlp/util/syllable.py
        modified:   pythainlp/util/thai.py
        modified:   pythainlp/util/trie.py
        modified:   pythainlp/util/wordtonum.py
        modified:   pythainlp/wangchanberta/core.py
        modified:   pythainlp/word_vector/core.py

no changes added to commit (use "git add" and/or "git commit -a")

Steps to reproduce

  1. git clone https://github.com/PyThaiNLP/pythainlp && cd pythainlp
  2. pip install black
  3. black --line-length=79 pythainlp
  4. git status
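
The steps above rewrite files in place. As a minimal sketch, the same check can be done without touching the working tree by running black in `--check` mode and collecting the files it reports. The helper names `parse_would_reformat` and `list_unformatted` are made up for illustration, and the sketch assumes black is installed and reports each offending file on a line starting with `would reformat `.

```python
import subprocess
import sys


def parse_would_reformat(output: str) -> list[str]:
    """Extract paths from black's 'would reformat <path>' report lines."""
    prefix = "would reformat "
    return [
        line[len(prefix):].strip()
        for line in output.splitlines()
        if line.startswith(prefix)
    ]


def list_unformatted(
    target: str = "pythainlp", line_length: int = 79
) -> list[str]:
    """Run black in check mode (no files are rewritten) and list offenders."""
    result = subprocess.run(
        [
            sys.executable, "-m", "black",
            "--check", f"--line-length={line_length}", target,
        ],
        capture_output=True,
        text=True,
    )
    # black writes its per-file report to stderr; stdout is scanned as well
    # so the sketch does not depend on which stream is used.
    return parse_would_reformat(result.stderr + "\n" + result.stdout)
```

Calling `list_unformatted()` from the repository root should return roughly the same set of paths that `git status` shows as modified after step 3, with the working tree left clean.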

Possible solution

This can be resolved temporarily by creating one PR that reformats all the code to the standard. A more permanent solution may be to update the current GitHub Action to run black and throw an error if unformatted code is detected.

I can help open a PR for both if you like.
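
For the permanent solution, a rough sketch of a gate script that a CI step could call is shown here. It is not the project's actual workflow: the function names `describe_exit_code` and `run_format_gate` are hypothetical. It relies only on black's documented `--check` exit codes (0 when nothing would change, 1 when some files would be reformatted, 123 on an internal error).

```python
import subprocess
import sys


def describe_exit_code(code: int) -> str:
    """Translate black's --check exit codes into a short CI message."""
    if code == 0:
        return "ok: nothing would change"
    if code == 1:
        return "fail: some files would be reformatted"
    if code == 123:
        return "error: black hit an internal error"
    return f"error: unexpected exit code {code}"


def run_format_gate(target: str = "pythainlp", line_length: int = 79) -> int:
    """Run black in check mode and return its exit code unchanged."""
    result = subprocess.run(
        [
            sys.executable, "-m", "black",
            "--check", f"--line-length={line_length}", target,
        ]
    )
    print(describe_exit_code(result.returncode))
    return result.returncode
```

A workflow step could end with `sys.exit(run_format_gate())`, so that any non-zero code fails the job and unformatted code cannot be merged unnoticed.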

github-actions[bot] commented 2 years ago

Hello @new5558, thank you for your interest in our work!

If this is a bug report, please provide screenshots and minimal code to reproduce your issue; otherwise we cannot help you.

wannaphong commented 2 years ago

Thank you 👍 I don't know whether it should limit line length to 80 chars or not, but I agree that the code should follow CONTRIBUTING.md. You can open a PR.

wannaphong commented 2 years ago

It looks like the pep8speaks bot is dead.