Open tobias opened 3 weeks ago
Would the above work for you @cursive-ide? This is I think the bare minimum, so I'm happy to discuss adding more data to the feed.
Yes, I think that would work well. I'm a little confused by the pagination - I pass a from parameter, which will then get me releases up to 30 days after that date. But will the next field then return releases after the first 30? So the idea is that I would start from the oldest date and then iterate forward until there are none left?
Also, it might be a good idea to have a flag to only include non-SNAPSHOT versions? I'm not sure about this, I'm not sure whether I'd want to index snapshots or not - I'll think about this.
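Until such a flag exists, a client could drop snapshots on its own side. A minimal sketch, assuming each release item carries a version string under a key named version (that key name and the sample data are hypothetical; the feed's item shape isn't settled in this thread). It relies only on the Maven convention that snapshot versions end in -SNAPSHOT:

```python
def is_snapshot(version: str) -> bool:
    """Maven convention: snapshot versions end in -SNAPSHOT."""
    return version.endswith("-SNAPSHOT")


def drop_snapshots(releases):
    """Keep only the release items whose version is not a snapshot."""
    return [r for r in releases if not is_snapshot(r["version"])]


# Hypothetical feed items, for illustration only.
releases = [
    {"artifact": "example-lib", "version": "1.0.0"},
    {"artifact": "example-lib", "version": "1.1.0-SNAPSHOT"},
]
stable = drop_snapshots(releases)
```

Filtering client-side still transfers the snapshot entries, so a server-side flag would be the cheaper option if snapshots turn out to be a large share of the feed.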
@cursive-ide:
I'm a little confused by the pagination - I pass a from parameter, which will then get me releases up to 30 days after that date. But will the next field then return releases after the first 30? So the idea is that I would start from the oldest date and then iterate forward until there are none left?
Yes, correct. You would pass from=date1, and would get 30 days worth of releases. The next url in the response would have from=date2, where date2 would be the earlier of:
You could then page until you got an empty array, and the next url is where you could start next time.
However, I realize that that won't account for a 30 day period where there are no releases (I suspect we have gaps like that in the early days), as we will return an empty array for those gaps, which will appear to be the end of the stream. So we need another way to signal "there are no more pages".
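To make the gap problem concrete, here is a small simulation of window-based paging. It is only a sketch under assumptions: the release dates are made up, and the cursor is assumed to simply advance by 30 days per page, which may not match how the real next url would be computed.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(days=30)

# Hypothetical release timestamps with a gap of more than 30 days.
RELEASES = [
    datetime(2012, 1, 5),
    datetime(2012, 1, 20),
    datetime(2012, 4, 1),  # sits behind an empty 30-day window
]


def window_page(from_ts):
    """Return the releases in (from_ts, from_ts + 30 days]."""
    return [r for r in RELEASES if from_ts < r <= from_ts + WINDOW]


def naive_client(start):
    """Page forward and stop at the first empty array."""
    seen, cursor = [], start
    while True:
        page = window_page(cursor)
        if not page:
            return seen
        seen.extend(page)
        cursor = cursor + WINDOW  # assumption: next page starts at window end


result = naive_client(datetime(2012, 1, 1))
```

Here the client stops after the two January releases because the following 30-day window is empty, and it never sees the April release.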
An alternate approach is we don't give you 30 days of releases, but instead send up to n releases (100?). Then there will never be an empty page until you reach the end of the feed.
So then the from param in the next url would be either:
- released-at value from the last release on the page (if there are release items returned)
- from given in the request (if there are no release items to return)

Also, it might be a good idea to have a flag to only include non-SNAPSHOT versions? I'm not sure about this, I'm not sure whether I'd want to index snapshots or not - I'll think about this.
I'll start w/o this unless you say you need it; it would be simple to add later.
For context: if we returned 100 results/page, it would take 3045 pages to iterate through all of the releases throughout history. I think 500 results/page would also be fine from a performance or load perspective, which would mean only 608 pages to get all releases.
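The count-based paging rule described above could look roughly like this. It is an in-memory sketch, not the Clojars implementation: the sample releases are hypothetical, the page size is kept tiny for illustration, and next_from stands in for the from value that would be embedded in the next url.

```python
PAGE_SIZE = 2  # tiny for illustration; the thread discusses 100 or 500

# Hypothetical releases, sorted by released-at. The timestamps share one
# fixed UTC format, so plain string comparison orders them correctly.
RELEASES = [
    {"released-at": "2012-01-05T00:00:00.000Z"},
    {"released-at": "2012-01-20T00:00:00.000Z"},
    {"released-at": "2012-04-01T00:00:00.000Z"},
    {"released-at": "2012-04-02T00:00:00.000Z"},
    {"released-at": "2012-09-15T00:00:00.000Z"},
]


def page(from_ts):
    """Return up to PAGE_SIZE releases strictly after from_ts."""
    items = [r for r in RELEASES if r["released-at"] > from_ts][:PAGE_SIZE]
    # Last release on the page if there is one, else echo the request's from.
    next_from = items[-1]["released-at"] if items else from_ts
    return {"releases": items, "next_from": next_from}


def fetch_all(start):
    """Follow next_from until a page comes back empty."""
    seen, cursor = [], start
    while True:
        resp = page(cursor)
        if not resp["releases"]:
            return seen, resp["next_from"]
        seen.extend(resp["releases"])
        cursor = resp["next_from"]


seen, resume_at = fetch_all("2012-01-01T00:00:00.000Z")
```

With this rule an empty page only occurs at the end of the feed, and resume_at is where the client would start next time. One thing to watch in a real implementation: with a strict "after" comparison, releases that share an identical released-at value across a page boundary could be skipped.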
We've had a request for a feed of releases.
I think we could do this via an API endpoint. Something like:
GET https://clojars.org/api/release-feed?from=2012-03-01T21:38:31.525Z
The required from param is a timestamp where the feed starts (releases after that timestamp will be returned). The response will be json, and include up to 30 days of releases, and include a link to get the next page/batch:
The end of the feed would be signaled by an empty releases array, and the from value in the next property will be the released-at of the most recent release (though that can likely be considered just an implementation detail):
Each non-SNAPSHOT version should appear only once in the feed, but SNAPSHOT versions could appear multiple times; ~they will appear for the latest version, but if a release occurs while you are paging the results, the SNAPSHOT will appear again. This is due to how we track versions in the db; SNAPSHOTs have a single entry in the table that is updated on release instead of a new one added (IIRC).~ That is incorrect; we store an entry per SNAPSHOT release, so they will appear in the feed at a position that matches each time it was released.
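On the client side, building the request url and reading the from value back out of a next url might look like the following. This is a sketch using only Python's standard urllib.parse; the helper names are made up, and the endpoint is the one proposed above, which may change before it ships.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Endpoint as proposed in this issue; treat it as provisional.
FEED_URL = "https://clojars.org/api/release-feed"


def feed_url(from_ts: str) -> str:
    """Build the feed url for a given from timestamp, percent-encoding it."""
    return f"{FEED_URL}?{urlencode({'from': from_ts})}"


def from_of(next_url: str) -> str:
    """Extract the from timestamp out of a next url."""
    return parse_qs(urlsplit(next_url).query)["from"][0]


url = feed_url("2012-03-01T21:38:31.525Z")
cursor = from_of(url)
```

Encoding the parameter matters mostly if timestamps ever carry a +00:00 style offset, since an unencoded + in a query string is decoded as a space.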