chaoss / grimoirelab-elk

GNU General Public License v3.0

[enriched/slack] Add stopwords in the field 'text_analyzed' #978

Closed zhquan closed 3 years ago

zhquan commented 3 years ago

This change allows updating the index settings through the method 'create_mappings' (elastic.py):

  1. close the index.
  2. update the settings with stopwords.
  3. open the index.
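
The three steps above can be sketched roughly as follows. This is only an illustration, not the code from this PR: the helper `update_analysis_settings`, the `send` callable, the index name `slack_enriched` and the stopword list (`_english_`) are all assumptions. Only the analyzer name `my_stop_analyzer` and the Elasticsearch endpoints `_close`, `_settings` and `_open` are taken as given.

```python
import json

# Hypothetical analyzer settings. The analyzer name matches the mapping in
# this PR, but the stopword list used by the PR is not shown here, so
# "_english_" is only an assumption.
STOPWORD_SETTINGS = {
    "analysis": {
        "analyzer": {
            "my_stop_analyzer": {
                "type": "stop",
                "stopwords": "_english_",
            }
        }
    }
}


def update_analysis_settings(send, es_url, index, settings=STOPWORD_SETTINGS):
    """Close the index, update its analysis settings, and reopen it.

    `send(method, url, body)` is a caller-supplied function that performs
    the HTTP request, for example a thin wrapper around requests.request.
    """
    index_url = es_url.rstrip("/") + "/" + index
    # 1. Analysis settings can only be changed while the index is closed.
    send("POST", index_url + "/_close", None)
    try:
        # 2. Update the settings with the stopword analyzer.
        send("PUT", index_url + "/_settings", json.dumps(settings))
    finally:
        # 3. Reopen the index even if the update failed, so it stays usable.
        send("POST", index_url + "/_open", None)


# Dry run: record the requests instead of sending them.
calls = []
update_analysis_settings(
    lambda method, url, body: calls.append((method, url)),
    "http://localhost:9200",
    "slack_enriched",
)
for method, url in calls:
    print(method, url)
```

Passing the HTTP function in as `send` keeps the sketch independent of any particular client library and makes the order of the requests easy to check without a running Elasticsearch instance.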

Signed-off-by: Quan Zhou <quan@bitergia.com>

coveralls commented 3 years ago

Pull Request Test Coverage Report for Build 919024259


| Files with Coverage Reduction | New Missed Lines | % |
| --- | --- | --- |
| /home/runner/work/grimoirelab-elk/grimoirelab-elk/grimoire_elk/enriched/slack.py | 1 | 96.64% |
| /home/runner/work/grimoirelab-elk/grimoirelab-elk/grimoire_elk/elastic.py | 28 | 90.0% |
| /home/runner/work/grimoirelab-elk/grimoirelab-elk/grimoire_elk/utils.py | 32 | 66.42% |
| /home/runner/work/grimoirelab-elk/grimoirelab-elk/grimoire_elk/enriched/enrich.py | 110 | 73.72% |
| Total | 171 | |

| Totals | |
| --- | --- |
| Change from base Build 874815567 | -0.1% |
| Covered Lines | 8667 |
| Relevant Lines | 10537 |

ajaragz commented 3 years ago

Deployed in a test instance with some Slack channels. Common words are no longer shown in the dashboard. LGTM

canasdiaz commented 3 years ago

I've built a slack enriched index with grimoirelab-0.2.55. When I execute micro.py with the code from this PR I get the following error:

(venv) luis@memento-mori ~/p/g/utils (master)> python3 micro.py --enrich --cfg ~/pull-978/configuration-pr --backends slack
  2021-05-26 23:34:17,419 Reading projects data from  /home/luis/pull-978/projects.json 
  2021-05-26 23:34:27,429 [slack] enrichment phase starts
  2021-05-26 23:34:27,446 [slack] enrichment starts for C044DUK7E
  2021-05-26 23:34:27,529 Error creating ES mappings {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Mapper for [text_analyzed] conflicts with existing mapping in other types:\n[mapper [text_analyzed] has different [analyzer]]"}],"type":"illegal_argument_exception","reason":"Mapper for [text_analyzed] conflicts with existing mapping in other types:\n[mapper [text_analyzed] has different [analyzer]]"},"status":400}. Mapping: 
        {
            "properties": {
                "text_analyzed": {
                  "type": "text",
                  "fielddata": true,
                  "analyzer": "my_stop_analyzer"
                }
           }
        } 
...
  2021-05-26 23:34:28,132 [slack] Done enrichment for https://slack.com/C85UMPADU
  2021-05-26 23:34:28,132 [slack] enrichment finished for C85UMPADU
  2021-05-26 23:34:28,132 [slack] enrichment starts for C8NTRHLKT
  2021-05-26 23:34:28,142 Error creating ES mappings {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Mapper for [text_analyzed] conflicts with existing mapping in other types:\n[mapper [text_analyzed] has different [analyzer]]"}],"type":"illegal_argument_exception","reason":"Mapper for [text_analyzed] conflicts with existing mapping in other types:\n[mapper [text_analyzed] has different [analyzer]]"},"status":400}. Mapping: 
        {
            "properties": {
                "text_analyzed": {
                  "type": "text",
                  "fielddata": true,
                  "analyzer": "my_stop_analyzer"
                }
           }
        } 

The versions I'm using are:

(venv) luis@memento-mori ~/p/g/utils (master)> pip3 freeze 
astroid==2.5.6
bandit==1.7.0
beautifulsoup4==4.9.3
cereslib==0.1.15
certifi==2020.12.5
cffi==1.14.5
chardet==3.0.4
colorlog==4.1.0
cryptography==3.4.7
decorator==4.4.2
dulwich==0.20.21
elasticsearch==6.3.1
elasticsearch-dsl==6.3.1
feedparser==6.0.2
file-read-backwards==2.0.0
flake8==3.9.2
geographiclib==1.50
geopy==2.1.0
gitdb==4.0.7
GitPython==3.1.17
graal==0.2.9
grimoire-elk==0.86.0
grimoirelab-panels==0.0.60
grimoirelab-toolkit==0.2.0
idna==2.8
isort==5.8.0
Jinja2==2.11.1
kidash==0.4.19
lazy-object-proxy==1.6.0
lizard==1.16.6
MarkupSafe==2.0.0
mccabe==0.6.1
networkx==2.5.1
numpy==1.18.3
pandas==0.25.3
patsy==0.5.1
pbr==5.6.0
perceval==0.17.6
perceval-finos==0.1.9
perceval-mozilla==0.2.12
perceval-opnfv==0.1.19
perceval-puppet==0.1.18
perceval-weblate==0.1.0
pycodestyle==2.7.0
pycparser==2.20
pydot==1.4.2
pyflakes==2.3.1
PyJWT==2.1.0
pylint==3.0.0a3
PyMySQL==0.9.3
pyparsing==3.0.0b2
python-dateutil==2.8.1
pytz==2021.1
PyYAML==5.4.1
requests==2.21.0
scipy==1.6.3
sgmllib3k==1.0.0
sirmordred==0.2.37
six==1.16.0
smmap==4.0.0
sortinghat==0.7.15
soupsieve==2.2.1
SQLAlchemy==1.3.24
statsmodels==0.12.2
stevedore==3.3.0
toml==0.10.2
urllib3==1.24.3
wrapt==1.12.1

CC @zhquan

zhquan commented 3 years ago

@sanacl are you using an existing enriched index or a new one?

If you are using an existing enriched index, the error is expected because we modified the mapping. You must use a new index and let SirMordred create the new mapping.
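
To illustrate why an existing index fails: Elasticsearch does not allow changing the analyzer of a field that is already mapped, so the new mapping conflicts with the old one. A minimal sketch of a check for this situation is below. The helper `analyzer_conflicts` is hypothetical, and the `existing` mapping is an assumption about what an old index might contain (a `text_analyzed` field without a custom analyzer); only the new mapping is copied from the error log above.

```python
def analyzer_conflicts(existing_properties, new_properties):
    """Return the fields whose analyzer differs between two 'properties' dicts.

    Fields that are missing from the existing mapping are skipped, because
    adding a new field to an existing index is allowed.
    """
    conflicts = {}
    for field, new_def in new_properties.items():
        old_def = existing_properties.get(field)
        if old_def is None:
            continue
        old_analyzer = old_def.get("analyzer")
        new_analyzer = new_def.get("analyzer")
        if old_analyzer != new_analyzer:
            conflicts[field] = (old_analyzer, new_analyzer)
    return conflicts


# Assumed mapping of an old enriched index (not shown in this thread).
existing = {"text_analyzed": {"type": "text", "fielddata": True}}

# Mapping from the error message in this thread.
new = {
    "text_analyzed": {
        "type": "text",
        "fielddata": True,
        "analyzer": "my_stop_analyzer",
    }
}

conflicts = analyzer_conflicts(existing, new)
if conflicts:
    print("Use a new enriched index; conflicting fields:", conflicts)
```

With a new (empty) index the check finds no conflicts, which corresponds to letting SirMordred create the mapping from scratch.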

canasdiaz commented 3 years ago

@zhquan then I did not understand the use case. Please let me know whether I got it this time.

Is this what you mean, @zhquan?

zhquan commented 3 years ago

@sanacl you have to:

If you use a new enriched index you should not see any errors.

But if you use an old enriched index you will get the "Error creating ES mappings" error.