Open anthonyharrison opened 1 year ago
Thank you for opening your first issue here :+1:. Be sure to follow the issue template if you chose one.
I'm actually working on something like that :smile_cat:
Hi @anthonyharrison, thank you for the idea.
endoflife.date uses the static site generator Jekyll. Given the static nature of endoflife.date, that may be difficult to implement: JSON and HTML files are only generated when there is an update on the master branch.
Would a dataset published via an npm package be good enough? Or a separate git repository that could fulfill the "update whenever needed" requirement easily?
I've been wanting to do this for a while, by means of uploading the generated JSON files (preferably in the v1 API format) to a release on GitHub.
But this would also require that I know all of the products in the first place
As an aside, we have an endpoint that solves this: https://endoflife.date/api/all.json.
@adriens Could you detail your plan to solve for this?
The API endpoint is a good start. Just getting a download of all of the data in JSON would be very useful. Finding out what has changed since the last download could be done in a number of ways. The simplest is to report whether there have been any changes since a particular date, in which case you just download all the data again. The more elegant but slightly more complex option would be to download only the changes since a particular date rather than all the data. But given that the current amount of data isn't huge, I imagine the first solution would be a good start.
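The two strategies described above could be sketched roughly as follows. This is only an illustration, assuming each product exposes some last-modified date; the `last_modified` mapping, the function names and the sample data are hypothetical and not part of the current API.

```python
from datetime import date


def any_change_since(last_modified: dict[str, date], since: date) -> bool:
    """Simple strategy: True means 'something changed, download everything again'."""
    return any(changed > since for changed in last_modified.values())


def products_changed_since(last_modified: dict[str, date], since: date) -> list[str]:
    """Selective strategy: names of the products modified after `since`."""
    return sorted(name for name, changed in last_modified.items() if changed > since)


# Hypothetical sample data, for illustration only.
sample = {
    "product-a": date(2023, 2, 1),
    "product-b": date(2023, 2, 20),
}
```

The first function is enough while the dataset stays small; the second only pays off once downloading everything becomes noticeably expensive.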
I would rather not force the introduction of a new ecosystem (npm).
Maybe you would appreciate this repo: https://github.com/adriens/endoflife.date-nested
@adriens I can certainly use this as a starting point. However, https://endoflife.date/api/all.json already provides the data in JSON. If it were enhanced to include some more metadata, e.g. the date of the data dump, this would be the start of something very useful.
@captn3m0 Would it not be possible to maintain a history of changes to the information contained within the _data directory and then return details of the products which have changed via an API? The current API endpoints allow me to get all of the data, but they require that I fetch everything and not just what has been updated.
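Until a server-side history exists, a client could approximate "what changed" by comparing two downloaded snapshots. The sketch below assumes each snapshot is a dictionary mapping a product name to its JSON data; that shape and the function names are assumptions made for illustration, not an existing endoflife.date feature.

```python
import hashlib
import json


def fingerprint(product_data: object) -> str:
    """Stable hash of one product's data (keys are sorted so ordering does not matter)."""
    encoded = json.dumps(product_data, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def changed_products(old: dict, new: dict) -> dict[str, list[str]]:
    """Compare two snapshots and list the added, removed and updated product names."""
    old_hashes = {name: fingerprint(data) for name, data in old.items()}
    new_hashes = {name: fingerprint(data) for name, data in new.items()}
    return {
        "added": sorted(new_hashes.keys() - old_hashes.keys()),
        "removed": sorted(old_hashes.keys() - new_hashes.keys()),
        "updated": sorted(
            name
            for name in old_hashes.keys() & new_hashes.keys()
            if old_hashes[name] != new_hashes[name]
        ),
    }
```

This still requires downloading the full snapshot each time; it only tells the application which products it needs to re-process.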
A timestamp containing the date of the JSON file would be easy to add, but it requires the v1 API format (under development, see #2080 and https://deploy-preview-2080--endoflife-date.netlify.app/docs/api/v1/ for a preview). Unfortunately the current format (v0) cannot be updated without introducing a breaking change, and we did not plan to add new endpoints.
I do not mind adding a new /v1/products/all endpoint containing all the products with their corresponding release cycles. But that file will be big (I don't know exactly how big, but at least a few MB). So I think we should consider Netlify bandwidth limits before doing that. @captn3m0, do you think it may be problematic?
Our current bandwidth usage is around 50 GB out of our 1 TB limit, so I don't see any issue there. If this endpoint ever becomes a problem, we can easily set up a redirect to another host, implement caching, etc.
However, I don't think we should be abusing our API to essentially serve a dataset. I can suggest a few alternative approaches:
https://github.com/endoflife-date/endoflife.date/releases/latest/dataset.tar.gz will always point to the latest version of the dataset, and that can be used for any programmatic usage.
@anthonyharrison I'd be curious about the use case here, to see if we can improve the API/documentation/roadmap further to account for this.
We can also add a new JSON endpoint called XYZ_meta.json that will just keep the metadata for XYZ, and users can decide to fetch the whole real data from a hosted cache or from our normal place, XYZ_data.json.
So XYZ_meta.json would only keep metadata, something like revision_id, revision_date and revision_dataurl, so that projects like the one from adriens or someone else can make a check before fetching the actual data.
This will help them determine, before downloading the same big file (for example all_data.json), whether its revision_date is the same as their own.
NOTE: adding just revision_date to our current endpoints won't fix the main problem, which is that users still need to re-download the same big file, unless we implement this idea. @captn3m0 @marcwrobel @anthonyharrison @adriens
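A client-side check against such a metadata file could look roughly like this. It is a minimal sketch, assuming the proposed revision_id and revision_dataurl fields exist; `fetch_meta` and `fetch_data` are hypothetical callables standing in for whatever HTTP client the application uses.

```python
from typing import Callable, Optional, Tuple


def needs_download(local_meta: Optional[dict], remote_meta: dict) -> bool:
    """True when there is no local copy or the remote revision differs from ours."""
    if local_meta is None:
        return True
    return local_meta.get("revision_id") != remote_meta.get("revision_id")


def sync(
    local_meta: Optional[dict],
    fetch_meta: Callable[[], dict],
    fetch_data: Callable[[str], object],
) -> Tuple[dict, Optional[object]]:
    """Return (meta, data); data is None when the local copy is still current."""
    remote_meta = fetch_meta()  # small request, always performed
    if not needs_download(local_meta, remote_meta):
        return local_meta, None
    # Only now pay for the big download, using the URL advertised in the metadata.
    return remote_meta, fetch_data(remote_meta["revision_dataurl"])
```

The application would persist the returned metadata next to its local copy and pass it back in on the next run.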
@usta, is XYZ the product name? If yes, the product files are not that big (2 to 20 KB each, I would say), so I think sending two separate requests could take longer than retrieving all the data in one shot.
Note that the v1 product API endpoint already includes a lastModified field, corresponding to the last time the product file was updated. Example: https://deploy-preview-2080--endoflife-date.netlify.app/api/v1/products/ansible/.
@captn3m0 I am trying to develop an automated audit function which will identify whether a product is under support, under extended support or EOL, and trigger some workflows. For products which are nearing end of support, I want to be able to trigger a workflow to look at the upgrade path; for those which are EOL (or nearing EOL), I would want to trigger a different workflow.
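The classification step of such an audit could be sketched as below. This is an illustration only: the parameter names, the status labels and the 90-day warning window are assumptions, and mapping them onto the actual API fields is left to the caller.

```python
from datetime import date, timedelta
from typing import Optional


def support_status(
    today: date,
    eol: date,
    extended_end: Optional[date] = None,
    warning: timedelta = timedelta(days=90),
) -> str:
    """Classify a release cycle so the caller can pick the matching workflow."""
    if today <= eol:
        # Still under regular support; flag it when the end is close.
        return "nearing-eol" if eol - today <= warning else "supported"
    if extended_end is not None and today <= extended_end:
        return "extended-support"
    return "eol"
```

A caller could then dispatch on the returned label, for example starting an upgrade-path review for "nearing-eol" and a different workflow for "eol".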
@marcwrobel Nope, I mean the all, upcomingEOL, ... endpoints.
@captn3m0, I'll release a first draft in a few minutes :crossed_fingers:
@captn3m0, here is a first proof of concept:
https://www.kaggle.com/datasets/adriensales/endoflifedate/
Please notice that:
select category,
count(*)
from product_categories
group by category
having count(*) > 10;
There are some cool surprises I'm working on too, on the same topic.
:point_up: Other files will be added : does anyone want to give a try to a ;
:star_struck:
:thought_balloon: :
I opened a PR to implement the idea in https://github.com/endoflife-date/endoflife.date/issues/2530#issuecomment-1439830897, since I liked that idea and would make use of it myself.
@adriens I see this as orthogonal to your efforts. Your work seems much more full-featured as compared to the simple GitHub Action I wrote, but I still think having a GitHub Release with a simple file is useful.
This is not a requirement for the time being, I guess because this information is not always available. Do you know where this information can be found?
Yes, both approaches are useful :+1:
Hi guys, I finally managed to get something quite consistent, check https://www.kaggle.com/code/adriensales/endoflife-date-offline-copy/notebook
Hi @anthonyharrison here is something for you
:point_down:
Just did a first test on #2080 to export all the products with their versions on a single endpoint: https://deploy-preview-2080--endoflife-date.netlify.app/api/v1/products/full.
The generated JSON file is much smaller than I expected: 696K uncompressed / 82K compressed (gzip).
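For anyone who wants to reproduce this kind of measurement on their own export, a rough sketch follows. The payload in the usage note is made up for illustration; real numbers depend on the actual data and on the compression level used by the host.

```python
import gzip
import json


def json_sizes(payload: object) -> tuple[int, int]:
    """Return (uncompressed_bytes, gzip_compressed_bytes) for a JSON-serializable payload."""
    raw = json.dumps(payload).encode("utf-8")
    return len(raw), len(gzip.compress(raw))
```

Usage, with hypothetical data: `json_sizes([{"cycle": str(i), "eol": "2025-01-01"} for i in range(500)])` returns the two sizes as a tuple. Repetitive keys such as these are why JSON of this shape tends to compress well.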
Great, I'll try to prepare integrations.
I really like the idea, but to avoid repeated API calls for every product I would like data on, I would like to be able to maintain a local copy of the data and then only download updates each time I start my application (or after a particular time period, e.g. only request updates once every 24 hours).
Ideally, I would be able to get the data in JSON format which I can then manage locally.
The alternative would be to call the API for each product to get its data. But this would also require that I know all of the products in the first place, which, given the dynamic nature of the data, isn't very attractive.
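The local-copy behaviour described here could be sketched as follows. It is a minimal illustration, assuming the data is available as a single JSON download; `fetch` is a hypothetical callable that performs the download, and the cache file location is chosen by the application.

```python
import json
import time
from pathlib import Path
from typing import Callable, Optional


def load_with_cache(
    cache_file: Path,
    fetch: Callable[[], object],
    max_age_seconds: float = 24 * 3600,
    now: Optional[float] = None,
) -> object:
    """Return the cached data if it is fresh enough, otherwise fetch and store it."""
    now = time.time() if now is None else now
    if cache_file.exists() and now - cache_file.stat().st_mtime < max_age_seconds:
        # Local copy is younger than the allowed age: no network call needed.
        return json.loads(cache_file.read_text(encoding="utf-8"))
    data = fetch()
    cache_file.write_text(json.dumps(data), encoding="utf-8")
    return data
```

With the default setting the application contacts the server at most once every 24 hours, regardless of how often it starts.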