sailuh / kaiaulu

An R package for mining software repositories
http://itm0.shidler.hawaii.edu/kaiaulu
Mozilla Public License 2.0

Bugzilla refresher depends on offset, want it to depend on datetime instead #300

Open anthonyjlau opened 6 months ago

anthonyjlau commented 6 months ago

Currently, the Bugzilla downloader uses the offset parameter to select which slice of the results each API call downloads.

Example API parameters:

/rest/bug?creation_time=2024-01-01T00:00:00Z&limit=20&offset=0
/rest/bug?creation_time=2024-01-01T00:00:00Z&limit=20&offset=20
/rest/bug?creation_time=2024-01-01T00:00:00Z&limit=20&offset=40
...

Each time the offset changes, the call downloads a different slice of the total results. For example, if 70 bugs were created after 2024-01-01 00:00:00, the first page starts at bug 0 and, with a limit of 20, ends at bug 19. The second API call sets the offset to 20, so the second page starts at bug 20 and ends at bug 39.
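To make the page boundaries concrete, here is a minimal Python sketch of the offset arithmetic. It is only an illustration, not Kaiaulu's code: the function name `offset_pages` is hypothetical, and the total of 70 bugs is the example figure used above.

```python
def offset_pages(total, limit):
    """Return (offset, last_index) for each page of offset-based results.

    Bugs are indexed from 0, so a page that starts at `offset`
    ends at `offset + limit - 1`, or at the final bug if fewer remain.
    """
    pages = []
    for offset in range(0, total, limit):
        last_index = min(offset + limit, total) - 1
        pages.append((offset, last_index))
    return pages


# 70 bugs with limit=20, as in the example above
for offset, last_index in offset_pages(total=70, limit=20):
    print(f"offset={offset}: bugs {offset} to {last_index}")
```

With these numbers, the last page (offset 60) is a partial page that holds only the 10 remaining bugs.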

Using offset is a bit confusing, so I recommend removing offset from the parameters entirely and advancing the creation_time parameter instead. This is more intuitive, and it also matches how other downloaders (GitHub, for example) download their data.

carlosparadis commented 6 months ago

@anthonyjlau Did you manually test the created date to see if it would work?

anthonyjlau commented 6 months ago

Yes, when I tested changing the created date manually, it worked.

For example, in this API call (https://bugzilla.redhat.com/rest/bug?creation_time=2024-04-26T12:00:00Z&include_fields=_default,comments&limit=20), the creation_time starts at 2024-04-26T12:00:00Z. The last issue on the page has a creation_time of 2024-04-26T13:29:18.

So I changed the creation_time field to 2024-04-26T13:29:19 (one second after the last creation time), and the API call looks like this (https://bugzilla.redhat.com/rest/bug?creation_time=2024-04-26T13:29:19Z&include_fields=_default,comments&limit=20).

This returns the same results as this API call (https://bugzilla.redhat.com/rest/bug?creation_time=2024-04-26T12:00:00Z&include_fields=_default,comments&limit=20&offset=20), which is what the downloader currently uses.
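The manual steps above could be sketched in Python roughly as follows. This is a hedged illustration rather than the downloader's actual implementation: the helper names `next_creation_time` and `build_query` are hypothetical, and it assumes timestamps are UTC with one-second resolution, as in the URLs above.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode


def next_creation_time(last_creation_time):
    """Return the timestamp one second after the last bug on the page.

    Accepts the timestamp with or without the trailing "Z"
    (assumption: all timestamps are UTC).
    """
    parsed = datetime.strptime(
        last_creation_time.rstrip("Z"), "%Y-%m-%dT%H:%M:%S"
    ).replace(tzinfo=timezone.utc)
    return (parsed + timedelta(seconds=1)).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_query(creation_time, limit=20):
    """Build the query string for the next page, with no offset parameter."""
    params = {
        "creation_time": creation_time,
        "include_fields": "_default,comments",
        "limit": limit,
    }
    # Keep ":" and "," unescaped so the URL matches the ones in this thread
    return "/rest/bug?" + urlencode(params, safe=":,")


# The last bug on the first page was created at 2024-04-26T13:29:18
cursor = next_creation_time("2024-04-26T13:29:18")
print(build_query(cursor))
```

One thing to check before relying on this: if several bugs share the same creation second and a page boundary falls between them, adding one second would skip the rest of that second's bugs. The sketch also assumes results come back ordered by creation time.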