dadosfera / Bugsfera

Pipeline with Jira not working properly #117

Closed rafaelsantanaep closed 1 year ago

rafaelsantanaep commented 1 year ago

Mandatory information:

Are there any customers directly impacted by this bug? Which ones?

Bug Category

Describe the bug

The Jira connector is failing to extract data from the Jira API using the Dadosfera API token.

To Reproduce

Steps to reproduce the behavior:

  1. Create a pipeline using the Jira data source
  2. Wait for it to run

Expected behavior

The pipeline runs and extracts all the data we need.

Does this bug impact any demo or a sale?

No

 

Other information:

Any logs, error output, etc? … All the jobs fail with an error indicating that the parameters auth and domain were not provided. The problem is that our connector did not originally use those parameters.

What environment of software are you using?

When the bug happened: 2023-08-16

rafaelsantanaep commented 1 year ago

While investigating, I discovered that we were installing the tap with meltano add extractor tap-jira and, because of that, the installed tap had changed from https://github.com/singer-io/tap-jira to https://github.com/MeltanoLabs/tap-jira. That explains the change in the contract of required parameters. I've forced the singer-connector to use the previous tap in the following PR
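
A minimal Python sketch of the contract change described above. Only auth and domain come from the error in the failing jobs, and they may not be the full list for the new tap. The singer-io key names are an assumption about that tap's basic-auth settings and should be checked against its README. missing_keys and the sample config are hypothetical.

```python
# Keys the failing jobs reported as missing (MeltanoLabs/tap-jira).
# This is only what the error mentioned, not necessarily the full list.
MELTANOLABS_REQUIRED = {"auth", "domain"}

# ASSUMPTION: basic-auth keys expected by singer-io/tap-jira; verify in its README.
SINGER_IO_REQUIRED = {"start_date", "username", "password", "base_url", "user_agent"}


def missing_keys(config: dict, required: set) -> list:
    """Return the required keys that are absent from config, sorted for stable output."""
    return sorted(required - set(config))


# Hypothetical config written for the singer-io tap.
config = {
    "start_date": "2023-01-01T00:00:00Z",
    "username": "user@example.com",
    "password": "<api-token>",
    "base_url": "https://example.atlassian.net",
    "user_agent": "example-agent",
}

print(missing_keys(config, SINGER_IO_REQUIRED))    # keys missing for the old tap
print(missing_keys(config, MELTANOLABS_REQUIRED))  # keys missing for the new tap
```

Under these assumptions, a config that satisfies the old tap still lacks exactly the two parameters named in the error, which matches a silent swap of the installed tap.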

rafaelsantanaep commented 1 year ago

While testing, the connector now ran successfully but did not collect anything. Further investigation turned up two issues:

  1. The API Token used by @allansene was incorrect (didn't return any results from the API).
  2. The version we were using had a "bug" that makes it work only with API tokens for on-premise deployments:
    def test_basic_credentials_are_authorized(self):
        # Make a call to myself endpoint for verify creds
        # Here, we are retrieving serverInfo for the Jira instance by which credentials will also be verified.
        # Assign True value to is_on_prem_instance property for on-prem Jira instance
        self.is_on_prem_instance = self.request("users","GET","/rest/api/2/serverInfo").get('deploymentType') == "Server"

So we rolled back to an earlier version, 2.1.1, which has a different implementation:

    def test_basic_credentials_are_authorized(self):
        # Make a call to myself endpoint for verify creds
        self.request("test", "GET", "/rest/api/2/myself")
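
Both issues above come down to whether a token is accepted by the myself endpoint. The following is a minimal standalone sketch for checking that outside the tap, assuming Jira Cloud basic auth with an account email and API token. It does not use the tap's own client. build_myself_request and all values are hypothetical, and the request is built but not sent.

```python
import base64
import urllib.request


def build_myself_request(base_url: str, email: str, api_token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request to the /rest/api/2/myself endpoint."""
    credentials = f"{email}:{api_token}".encode("utf-8")
    encoded = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/rest/api/2/myself",
        headers={"Authorization": f"Basic {encoded}", "Accept": "application/json"},
        method="GET",
    )


# Hypothetical values; replace with a real site, account email, and token.
request = build_myself_request(
    "https://example.atlassian.net/", "user@example.com", "<api-token>"
)
print(request.full_url)

# To actually check the token, send it with urllib.request.urlopen(request);
# rejected credentials surface as urllib.error.HTTPError (for example 401).
```

Sending the request with a real token before wiring it into the pipeline separates "the token is wrong" from "the tap mishandles the token".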

rafaelsantanaep commented 1 year ago

Even with those changes, the tap still returned no records until I changed our configuration to use the properties Singer capability instead of the catalog capability. I spotted the cause by reading the tap's __init__.py.

The following piece:

@singer.utils.handle_top_exception(LOGGER)
def main():
    args = get_args()

    # Setup Context
    catalog = Catalog.from_dict(args.properties) \
        if args.properties else discover()
    Context.config = args.config
    Context.state = args.state
    Context.catalog = catalog

args.properties is the old way of accessing the metadata about the streams. After changing the configuration to use it, I was able to run the tap locally in Orchest.
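
A minimal sketch of why the flag matters, using plain argparse rather than the tap's real argument parser. It assumes the runner passes stream selections via --catalog when the plugin declares the catalog capability and via --properties when it declares properties. That is an assumption about Meltano's behaviour and is not confirmed in this thread. parse_tap_args and catalog_source are hypothetical helpers that mirror only the branch quoted above.

```python
import argparse


def parse_tap_args(argv: list) -> argparse.Namespace:
    """Parse the two stream-selection flags a Singer tap may receive."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--properties")  # older flag
    parser.add_argument("--catalog")     # newer flag
    return parser.parse_args(argv)


def catalog_source(args: argparse.Namespace) -> str:
    """Mirror the quoted main(): only args.properties is consulted."""
    return "properties file" if args.properties else "discover()"


# Selections passed with --catalog are never read, so the tap falls back to discovery.
print(catalog_source(parse_tap_args(["--catalog", "catalog.json"])))     # discover()
# The same file passed with --properties is picked up.
print(catalog_source(parse_tap_args(["--properties", "catalog.json"])))  # properties file
```

If the discovered catalog carries no stream selections, a run can finish successfully and still emit no records, which is consistent with the symptom described above.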

Here is the PR fixing the bug

rafaelsantanaep commented 1 year ago

@allansene reported that he was able to get the data he wanted.