chaoss / grimoirelab-elk

[qm-elk] Add support for QM data enrichment #892

Closed vchrombie closed 4 years ago

vchrombie commented 4 years ago

This PR adds a new enricher for handling the gitlab data used for quality models.

micro-mordred configurations

project.json

{
    "amfoss": {
        "gitlabqm:issue": [
            "https://gitlab.com/amfoss/cms-mobile"
        ],
        "gitlabqm:merge": [
            "https://gitlab.com/amfoss/cms-mobile"
        ]
    }
}

setup.cfg

[gitlabqm:issue]
api-token = xxxx
raw_index = gitlabqm-issues_raw
enriched_index = gitlabqm-issues_enriched
sleep-for-rate = true
category = issue
no-archive = true

[gitlabqm:merge]
api-token = xxxx
raw_index = gitlabqm-merge_raw
enriched_index = gitlabqm-merge_enriched
sleep-for-rate = true
category = merge_request
no-archive = true

vchrombie commented 4 years ago

This is an early version of the enricher.

The final goal of this enricher is to convert the GitLab items (issues, merge requests) into a prosoul-consumable format according to the metrics. See vchrombie/gsoc#8.

vchrombie commented 4 years ago

Currently, the enricher works only for issues, and every issue is counted as an issue opened on a particular date. I will work on improving it so that issues are classified into active-issues and closed-issues.

vchrombie commented 4 years ago

I couldn't upload these items to Elasticsearch because I ran into an error:

  2020-06-16 18:22:22,216 Error enriching raw from gitlabqm (https://gitlab.com/amfoss/cms-mobile): 'id'
Traceback (most recent call last):
  File "/home/p0tt3r/chaoss/sources/grimoirelab-elk/grimoire_elk/elk.py", line 530, in enrich_backend
    enrich_count = enrich_items(ocean_backend, enrich_backend)
  File "/home/p0tt3r/chaoss/sources/grimoirelab-elk/grimoire_elk/elk.py", line 318, in enrich_items
    total = enrich_backend.enrich_items(ocean_backend)
  File "/home/p0tt3r/chaoss/sources/grimoirelab-elk/grimoire_elk/enriched/gitlabqm.py", line 166, in enrich_items
    ins_items += self.elastic.bulk_upload(items_to_enrich, self.get_field_unique_id())
  File "/home/p0tt3r/chaoss/sources/grimoirelab-elk/grimoire_elk/elastic.py", line 336, in bulk_upload
    bulk_json += '{{"index" : {{"_id" : "{}" }} }}\n'.format(item[field_id])
KeyError: 'id'

I'm not sure what the problem is. :sweat_smile: The whole log of the enrich task is in enrich-log.

vchrombie commented 4 years ago

I checked the output of the enriched items, though. Each item looks something like this:

  {
    'datetime': '2020-04-06T00:00:00+00:00' ,
    'metric_es_value': 9 ,
    'metric_es_value_weighted': 9 ,
    'project': 'cms-mobile' ,
    'metric_class': 'issues' ,
    'metric_type': 'LineChart' ,
    'metric_es_compute': 'sample' ,
    'metric_id': 'issues.numberOpenIssues' ,
    'metric_desc': 'The number of issues opened on a current date.' ,
    'metric_name': 'Number of Open Issues' ,
    'uuid': '23e0970c378e13131d75ecdcff4ac6f45b56583f'
  }

All the items (stored in the items_to_enrich variable) can be viewed in file-items-to-enrich.
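
For readers following along, here is a minimal sketch of how per-day "opened issues" items shaped like the one above could be produced. It is not the code from this PR: the helper name `opened_per_day`, the sample issues, and the way the `uuid` is derived (a SHA-1 over metric id, project, and date) are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timezone
import hashlib


def opened_per_day(issues, project):
    """Count issues by UTC creation day and emit one metric item per day.

    Hypothetical helper: only a subset of the fields from the item above
    is filled in, and the uuid scheme is an assumption.
    """
    counts = Counter(
        datetime.fromisoformat(issue["created_at"]).astimezone(timezone.utc).date()
        for issue in issues
    )
    items = []
    for day, count in sorted(counts.items()):
        # Midnight UTC of the day, in the same format as the item above.
        day_iso = datetime(day.year, day.month, day.day, tzinfo=timezone.utc).isoformat()
        metric_id = "issues.numberOpenIssues"
        raw_id = "|".join([metric_id, project, day_iso])
        items.append({
            "datetime": day_iso,
            "metric_es_value": count,
            "project": project,
            "metric_class": "issues",
            "metric_id": metric_id,
            # Assumed scheme: a stable hash so re-runs overwrite the same document.
            "uuid": hashlib.sha1(raw_id.encode("utf-8")).hexdigest(),
        })
    return items


# Made-up sample input; real GitLab items carry many more fields.
sample_issues = [
    {"created_at": "2020-04-06T09:00:00+00:00"},
    {"created_at": "2020-04-06T17:30:00+00:00"},
    {"created_at": "2020-04-07T08:15:00+00:00"},
]
metric_items = opened_per_day(sample_issues, "cms-mobile")
for metric_item in metric_items:
    print(metric_item["datetime"], metric_item["metric_es_value"])
```

Deriving the document identifier from the metric, project, and date keeps it stable across runs, so uploading the same day twice would update one document instead of creating a duplicate.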

valeriocos commented 4 years ago

> I couldn't upload these items to Elasticsearch because I ran into an error:
>
>   2020-06-16 18:22:22,216 Error enriching raw from gitlabqm (https://gitlab.com/amfoss/cms-mobile): 'id'
> Traceback (most recent call last):
>   File "/home/p0tt3r/chaoss/sources/grimoirelab-elk/grimoire_elk/elk.py", line 530, in enrich_backend
>     enrich_count = enrich_items(ocean_backend, enrich_backend)
>   File "/home/p0tt3r/chaoss/sources/grimoirelab-elk/grimoire_elk/elk.py", line 318, in enrich_items
>     total = enrich_backend.enrich_items(ocean_backend)
>   File "/home/p0tt3r/chaoss/sources/grimoirelab-elk/grimoire_elk/enriched/gitlabqm.py", line 166, in enrich_items
>     ins_items += self.elastic.bulk_upload(items_to_enrich, self.get_field_unique_id())
>   File "/home/p0tt3r/chaoss/sources/grimoirelab-elk/grimoire_elk/elastic.py", line 336, in bulk_upload
>     bulk_json += '{{"index" : {{"_id" : "{}" }} }}\n'.format(item[field_id])
> KeyError: 'id'
>
> I'm not sure what the problem is. The whole log of the enrich task is in enrich-log.

I cannot reply to that comment, so please check the comment at https://github.com/chaoss/grimoirelab-elk/pull/892#discussion_r440869120. In a nutshell, id is declared as the field that identifies the document, but the enriched items have no field named id.
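
The failure can be reproduced in isolation. The sketch below copies only the format string from the last frame of the traceback. The helper names `bulk_action_line` and `try_field` are made up for this illustration, and using "uuid" as the unique field is an assumption about one possible fix, not necessarily what the PR ended up doing.

```python
def bulk_action_line(item, field_id):
    """Build the bulk 'index' action line as in the traceback:
    the document _id is read from item[field_id]."""
    return '{{"index" : {{"_id" : "{}" }} }}\n'.format(item[field_id])


# Trimmed copy of the enriched item shown earlier in this thread.
item = {
    "metric_id": "issues.numberOpenIssues",
    "metric_es_value": 9,
    "uuid": "23e0970c378e13131d75ecdcff4ac6f45b56583f",
}


def try_field(field_id):
    """Return the action line, or a description of the KeyError raised."""
    try:
        return bulk_action_line(item, field_id)
    except KeyError as err:
        return "KeyError: {}".format(err)


result_id = try_field("id")      # the item has no 'id' key
result_uuid = try_field("uuid")  # a key the item does have
print(result_id)
print(result_uuid)
```

Either the declared unique field has to name a key that exists on every enriched item, or the items need to carry the declared key.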

coveralls commented 4 years ago

Pull Request Test Coverage Report for Build 2306

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| /home/travis/build/chaoss/grimoirelab-elk/grimoire_elk/enriched/github.py | 2 | 75.06% |
| /home/travis/build/chaoss/grimoirelab-elk/grimoire_elk/enriched/githubql.py | 6 | 96.98% |
| /home/travis/build/chaoss/grimoirelab-elk/grimoire_elk/enriched/mbox.py | 12 | 76.15% |
| /home/travis/build/chaoss/grimoirelab-elk/grimoire_elk/utils.py | 35 | 66.29% |
| Total | 55 | |

Totals
Change from base Build 2281: -0.5%
Covered Lines: 8345
Relevant Lines: 10260

vchrombie commented 4 years ago

> I cannot reply on that comment, please check the comment at #892 (comment). In a nutshell, id is declared to identify the id of the document, but there is no field named id

Yes, I understand the problem and the mistake I made. :facepalm: Thanks for the help. :slightly_smiling_face:

vchrombie commented 4 years ago

Hi @valeriocos, I have addressed your comments. I would like to hear your thoughts on the current implementation. :smiley:

vchrombie commented 4 years ago

Hi @valeriocos, I have added support for handling merge requests and generating the metrics from them.

I have started working on the git enricher and will open a PR for it soon. As discussed earlier, I will add this enricher to that PR and close this one. Maybe, you can review the implementation at once then. :smiley:

valeriocos commented 4 years ago

Thank you @vchrombie for the updates!

> Maybe, you can review the implementation at once then.

As you prefer :)

vchrombie commented 4 years ago

Closing this PR, in reference to #902.