MeltanoLabs / tap-gitlab

Singer.io Tap for extracting data from Gitlab's API
GNU Affero General Public License v3.0
8 stars 25 forks source link

Add missing data in streams #71

Closed laurentS closed 1 year ago

laurentS commented 2 years ago

This is a duplicate of #68 (now closed) from inside the repo to test CI access to env vars.

This PR updates several streams to use the sdk schema definition format instead of JSON files and:

The tap's output did not necessarily match the data returned by the api, so this might seem to break compatibility, but I think the missing fields were null anyway.

sonarcloud[bot] commented 2 years ago

SonarCloud Quality Gate failed.    Quality Gate failed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 3 Code Smells

No Coverage information No Coverage information
3.8% 3.8% Duplication

laurentS commented 2 years ago

@aaronsteers you seem to have done quite a bit of work on this tap recently, and I got the impression that you were testing it with meltano beyond the CI checks. Hopefully I've not broken anything when merging your changes into this branch :crossed_fingers:

laurentS commented 2 years ago

@aaronsteers I added the last 2 commits because despite passing CI checks, the output of the tap broke when fed into target-postgres. I couldn't find any checks in the sdk that would check for the problems I got, so added 2 tests here. I think they might be useful in the sdk itself, if it makes sense to add them there.

sonarcloud[bot] commented 1 year ago

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 1 Code Smell

No Coverage information No Coverage information
0.0% 0.0% Duplication

laurentS commented 1 year ago

At this point, I think this PR is ready. We have been running the code in a staging env for a while, with no major problems. I think a breaking change note would help, something like:

This release introduces breaking changes in the state management of the tap: streams now use numeric ids to track projects and groups to better handle changes in names. If you have bookmarks from a previous version, they will become invalid, meaning the tap will restart streams "from the beginning" the first time you run on this release.

laurentS commented 1 year ago

A warning printed to stderr would be nice, additionally.

How about a warning in TapGitlab.__init__() ? Something like "This version of the tap contains a number of breaking changes (streams, schema, state...). Please read the changelog for details"?

laurentS commented 1 year ago

@aaronsteers @edgarrmondragon This PR has been hanging for a while, sorry. I've updated the sdk to the latest version, updated the tests that have now moved in there, added a note in the README, and a runtime warning (in load_state) about the breaking change in the state format. I've also updated a bunch of development related packages, poetry, mypy, black, flake8 and the github actions used here.` In my mind, this is ready to merge. :rocket: Over to you :)

sonarcloud[bot] commented 1 year ago

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 1 Code Smell

No Coverage information No Coverage information
0.0% 0.0% Duplication