chaoss / grimoirelab-perceval

Send Sir Perceval on a quest to retrieve and gather data from software repositories.
http://perceval.readthedocs.io/
GNU General Public License v3.0

Jira backend is not working for the correct email/password pair (only some Jira URLs not all) #603

Open lukaszgryglicki opened 4 years ago

lukaszgryglicki commented 4 years ago

Command: p2o.py --enrich --index sds-o-ran-documentation-jira-raw --index-enrich sds-o-ran-documentation-jira -e (redacted) --db-host mariadb.mariadb --db-sortinghat sortinghat --db-user sortinghat --db-password (redacted) jira https://jira.o-ran-sc.org --project DOC -u (redacted) -p (redacted) --no-archive --verify False:

2019-12-05 07:34:19,788 [jira] Incremental from: None for https://jira.o-ran-sc.org
2019-12-05 07:34:19,789 Looking for issues at site 'https://jira.o-ran-sc.org', in project 'DOC' and updated from '1970-01-01 00:00:00+00:00'
2019-12-05 07:34:19,845 Error feeding raw from jira (https://jira.o-ran-sc.org): 401 Client Error:  for url: https://jira.o-ran-sc.org/rest/api/2/field
Traceback (most recent call last):
  File "/repos/grimoirelab-elk/grimoire_elk/elk.py", line 228, in feed_backend
    ocean_backend.feed(**params)
  File "/repos/grimoirelab-elk/grimoire_elk/raw/elastic.py", line 234, in feed
    self.feed_items(items)
  File "/repos/grimoirelab-elk/grimoire_elk/raw/elastic.py", line 250, in feed_items
    for item in items:
  File "/repos/grimoirelab-perceval/perceval/backend.py", line 215, in fetch
    for item in self.fetch_items(category, **kwargs):
  File "/repos/grimoirelab-perceval/perceval/backends/core/jira.py", line 161, in fetch_items
    fields = json.loads(self.client.get_fields())
  File "/repos/grimoirelab-perceval/perceval/backends/core/jira.py", line 354, in get_fields
    req = self.fetch(url)
  File "/repos/grimoirelab-perceval/perceval/client.py", line 133, in fetch
    response = self._fetch_from_remote(url, payload, headers, method, stream, verify, auth)
  File "/repos/grimoirelab-perceval/perceval/client.py", line 176, in _fetch_from_remote
    raise e
  File "/repos/grimoirelab-perceval/perceval/client.py", line 169, in _fetch_from_remote
    response.raise_for_status()
  File "/usr/local/lib/python3.5/dist-packages/requests/models.py", line 940, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error:  for url: https://jira.o-ran-sc.org/rest/api/2/field
2019-12-05 07:34:19,846 [jira] Done collection for https://jira.o-ran-sc.org
2019-12-05 07:34:19,846 Backend feed completed
2019-12-05 07:34:19,913 [jira] 0 items inserted for Jira
2019-12-05 07:34:19,914 [jira] Done enrichment for https://jira.o-ran-sc.org
2019-12-05 07:34:19,914 Enrich backend completed
2019-12-05 07:34:19,914 Finished in 0.00 min

I debugged this extensively, and I am 100% sure the credentials are correct. I used two Jira instances, one for Hyperledger and one for O-RAN. On both I have the same username (say user1) and password (say password1).

Both Jiras authenticate correctly from the UI using those credentials.

I've tracked this down to self.session.get(url, ...): it fails for O-RAN and works for Hyperledger. But if I replace that code with requests.Session().get(url, ...), it works for both (this is the first query, which just gets the list of fields from the REST API endpoint /rest/api/2/field). The difference is that self.session has self.session.auth set to the (user1, password1) pair, while a fresh requests.Session() makes the call without auth headers. My guess is that the O-RAN Jira does not accept the auth header, while the Hyperledger Jira does.

This is the case indeed, checked by curl:

For Hyperledger the credentials work: curl --user user1:password1 https://jira.hyperledger.org/rest/api/2/field

For O-RAN they do not: curl --user user1:password1 https://jira.o-ran-sc.org/rest/api/2/field
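The same curl check can be sketched in Python with only the standard library. This is a minimal sketch, not Perceval code: the helper names build_field_request and probe_basic_auth are hypothetical, and it assumes the instance answers on /rest/api/2/field as in the traceback above.

```python
import base64
import urllib.error
import urllib.request


def build_field_request(base_url, user, secret):
    """Build a GET request for /rest/api/2/field carrying a Basic auth header.

    This mirrors what `curl --user user:secret` sends.
    """
    pair = "{}:{}".format(user, secret).encode("utf-8")
    token = base64.b64encode(pair).decode("ascii")
    url = base_url.rstrip("/") + "/rest/api/2/field"
    return urllib.request.Request(url, headers={"Authorization": "Basic " + token})


def probe_basic_auth(base_url, user, secret, timeout=10):
    """Return the HTTP status code the instance answers with."""
    request = build_field_request(base_url, user, secret)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # A 401 here means the server rejected the Basic credentials.
        return err.code
```

Calling probe_basic_auth once per Jira URL with the same username and password should reproduce the difference seen with curl, if the cause really is how each instance treats the Basic auth header.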

The credentials work fine in the UI of both Jiras. It seems the O-RAN Jira expects credentials in some other way than auth headers, and the Grimoire backend does not support that...

The only difference I can see from the UI is that Hyperledger is not using SSO/LFID and O-RAN is using SSO/LFID.

How can I process Jira data with the Perceval backend when Jira uses SSO/LFID?


One more thing: the error message is very cryptic in this case. Finding the cause required debugging with pdb and modifying the sources, although adding a single line makes the problem obvious. The same is true of many other Grimoire backends: they return cryptic 40X client or server errors and nothing more, so no one has any idea why they fail...

If you edit grimoirelab-perceval/perceval/client.py and, inside def _fetch_from_remote(self, url, payload, headers, method, stream, verify, auth):, add the line print(response.text) (marked with + below):

        except Exception as e:
+            print(response.text)

You will get the exact cause of failure:

(...)
    <title>Unauthorized (401)</title>
(...)

                            <p>Encountered a <code>&quot;401 - Unauthorized&quot;</code> error while loading this page.</p>
                            <p>Basic Authentication Failure - Reason : AUTHENTICATED_FAILED</p>
(...)
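
Instead of a bare print, the response body could go through logging. The following is only a sketch of that idea, not the actual Perceval code: log_http_error_body is a hypothetical helper, and it assumes the exception carries a response attribute with status_code, url and text, as requests.exceptions.HTTPError does.

```python
import logging

logger = logging.getLogger(__name__)


def log_http_error_body(error, max_chars=2000):
    """Log the response body attached to an HTTP error, if there is one.

    Returns the (possibly truncated) body, or None when the error has
    no response attached.
    """
    response = getattr(error, "response", None)
    if response is None:
        return None
    # Truncate so a large HTML error page does not flood the log.
    body = response.text[:max_chars]
    logger.error("HTTP %s for %s; response body:\n%s",
                 response.status_code, response.url, body)
    return body
```

Called from the except block before re-raising, this would put the "Basic Authentication Failure" line into the log next to the 401 without changing how the error propagates.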

AlexanderLill commented 3 years ago

This is probably because Jira has deprecated authentication with username/password for some instances; see https://confluence.atlassian.com/cloud/deprecation-of-basic-authentication-with-passwords-for-jira-and-confluence-apis-972355348.html

Unfortunately, it seems that using an API token is not implemented yet?

Edit: It works if you create a new API token and use that token as the password. So the username is the email address of the account, and the password is the API token :)
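
A small sketch of how the token could be passed in place of the password, assuming the same jira arguments as the command in the original report (--project, -u, -p). The helper jira_backend_args and the JIRA_API_TOKEN environment variable are hypothetical names used only for illustration; reading the token from the environment just keeps it out of the shell history.

```python
import os


def jira_backend_args(url, project, email, token_env="JIRA_API_TOKEN"):
    """Build the jira backend arguments, reading the API token from the environment."""
    token = os.environ.get(token_env)
    if not token:
        raise RuntimeError("set {} to a Jira API token".format(token_env))
    # The username is the account's email address;
    # the API token goes where the password used to go (-p).
    return ["jira", url, "--project", project, "-u", email, "-p", token]
```

The returned list can be appended to the rest of the command line, for example when launching the tool through subprocess.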