internetarchive / iari

Import workflows for the Wikipedia Citations Database
GNU General Public License v3.0

Have /references endpoint accept article URL #819

Closed: harej closed 1 year ago

harej commented 1 year ago

I can already request /article data for a URL:

https://archive.org/services/context/iari/v2/statistics/article?url=https://en.wikipedia.org/wiki/Austria&regex=test

I would like to be able to do the same thing with the /references endpoint, for example:

https://archive.org/services/context/iari/v2/statistics/references?url=https://en.wikipedia.org/wiki/Austria&all=true

This would be useful if I am working with a list of page titles and not page IDs.
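To make the request shape concrete, here is a minimal sketch of building both query URLs from an article URL. The base path and the `url`, `regex`, and `all` parameters are taken from the examples above; the helper name `build_endpoint_url` is hypothetical, and the `url` parameter on /references is the behavior being requested in this issue, not something the endpoint is confirmed to support. The sketch only builds strings and does not call the service.

```python
from urllib.parse import urlencode

# Base path copied from the example URLs in this issue.
BASE = "https://archive.org/services/context/iari/v2/statistics"


def build_endpoint_url(endpoint: str, article_url: str, **params: str) -> str:
    """Build a query URL for an endpoint, percent-encoding the article URL.

    Hypothetical helper: `endpoint` is "article" or "references".
    """
    query = urlencode({"url": article_url, **params})
    return f"{BASE}/{endpoint}?{query}"


# Existing behavior: /article accepts an article URL.
article_request = build_endpoint_url(
    "article", "https://en.wikipedia.org/wiki/Austria", regex="test"
)

# Requested behavior (assumption): /references accepting the same `url` parameter.
references_request = build_endpoint_url(
    "references", "https://en.wikipedia.org/wiki/Austria", all="true"
)

print(article_request)
print(references_request)
```

Percent-encoding the article URL keeps characters in the page title from being misread as part of the outer query string, which matters when working from a list of arbitrary page titles.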

dpriskorn commented 1 year ago

Once https://github.com/internetarchive/iari/pull/826 is done, references/ will support serving the revisions of articles.

dpriskorn commented 1 year ago

I'm guessing you want this to "save" a request, right? In that case, I would say this is a very special case. This endpoint currently does no fetching from the internet. It is very simple and just reads from the cache on disk.

harej commented 1 year ago

Retrieving data by URL is not a special case; it is the use case supported by the other endpoints. I want to be able to go from article URL to wikitext references. I don't care what the implementation is. I just want it to happen.

harej commented 1 year ago

If this endpoint is only meant to read data that is already cached, then we should revisit the idea of adding original wikitext to the /article endpoint.

dpriskorn commented 1 year ago

I can do that very easily 😀 I'll keep the references lightweight by removing only the url_objects and the template_objects.
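As a rough illustration of that trimming step, the sketch below drops the two named fields from a reference before it is embedded in the /article response. It assumes each reference is a plain dictionary with `url_objects` and `template_objects` keys; the function name `lighten_reference` and the sample fields `id` and `wikitext` are hypothetical and not taken from the IARI codebase.

```python
from typing import Any

# The two fields named in the comment above; everything else is kept.
HEAVY_KEYS = ("url_objects", "template_objects")


def lighten_reference(reference: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the reference without the heavy nested fields."""
    return {key: value for key, value in reference.items() if key not in HEAVY_KEYS}


# Hypothetical reference record, for illustration only.
sample = {
    "id": "abc123",
    "wikitext": "<ref>{{cite web|url=https://example.com}}</ref>",
    "url_objects": [{"url": "https://example.com"}],
    "template_objects": [{"name": "cite web"}],
}

light = lighten_reference(sample)
print(sorted(light))
```

Returning a copy instead of deleting keys in place leaves the cached record untouched, so the full reference would still be available to the /references endpoint.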

dpriskorn commented 1 year ago

Closing in favor of https://github.com/internetarchive/iari/issues/831