dijs / wiki

Wikipedia Interface for Node.js
MIT License

Fetch smaller images for mainImage()? #174

Closed: dmccallie closed this issue 1 year ago

dmccallie commented 1 year ago

Hello,

This is a very nice library. Thank you for creating it.

I occasionally need to fetch the URL returned by page.mainImage(), and it works very well.

However, the returned URL apparently points to the original image, which in some cases is very large (65 MB in a recent test).

I don't need the large images, so my question is: is there any way to programmatically find and fetch one of the smaller versions of the main image?

For example, the images shown on a Wikipedia web page are considerably downsized from the original. Is there a programmatic way to fetch one of these smaller versions? I can see that the smaller image URLs encode the size in the URL itself, but they do not appear to be a simple mutation of the mainImage() URL.

Also, for some reason, when I fetch the mainImage() (using an <img src=xxx> element in a web page) the browser is not caching the very large image. Is there some way (like a cache control header?) for me to indicate that it's OK to be cached? (Pardon my ignorance here!)

Thanks!

dmccallie commented 1 year ago

For what it's worth, in case others run into this issue: I found a workaround of sorts.

See https://api.wikimedia.org/wiki/API_reference/Core/Media_files/Get_file for examples of how to ask the Wikimedia servers for the list of available versions of an image (by size and type), given that image's file name.

The gist is that after you extract the main image name using:

let image_name = await wiki().page('page_name').then(page => page.mainImage())

then pass image_name to the endpoint described in the link above. Note that mainImage() returns a full URL, so the file name has to be taken from that URL first. The query returns JSON describing the available versions of that image. You can pick whichever one fits your use case and use its URL in your <img> element.
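As a rough sketch of those steps (not a definitive implementation): the helper names below are hypothetical, and the code assumes that the file lives on Wikimedia Commons, that the last path segment of the mainImage() URL is the file name, and that the response has the thumbnail, preferred, and original entries (each with a width and a url) shown in the linked reference. It also assumes a runtime with a global fetch, such as Node 18+ or a browser.

```javascript
// Hypothetical helper: take the file name from the URL returned by mainImage().
// Assumption: the last path segment is the percent-encoded file name.
function fileNameFromImageUrl(imageUrl) {
  const segments = new URL(imageUrl).pathname.split('/');
  return decodeURIComponent(segments[segments.length - 1]);
}

// Hypothetical helper: build the "Get file" request URL.
// Assumption: the file is hosted on Wikimedia Commons; a file stored on a
// language wiki would need a different project path.
function fileInfoEndpoint(fileName) {
  return 'https://api.wikimedia.org/core/v1/commons/file/File:' +
    encodeURIComponent(fileName);
}

// Hypothetical helper: choose the narrowest version that is at least
// minWidth pixels wide, falling back to the widest one available.
function pickVersion(fileInfo, minWidth) {
  const candidates = ['thumbnail', 'preferred', 'original']
    .map(key => fileInfo[key])
    .filter(v => v && v.url && typeof v.width === 'number')
    .sort((a, b) => a.width - b.width);
  const fit = candidates.find(v => v.width >= minWidth);
  return fit || candidates[candidates.length - 1] || null;
}

// Ties the steps together. Falls back to the original URL when the
// response lists no usable version.
async function smallerMainImageUrl(mainImageUrl, minWidth) {
  const endpoint = fileInfoEndpoint(fileNameFromImageUrl(mainImageUrl));
  const response = await fetch(endpoint);
  if (!response.ok) {
    throw new Error('File lookup failed: HTTP ' + response.status);
  }
  const version = pickVersion(await response.json(), minWidth);
  return version ? version.url : mainImageUrl;
}
```

Usage would look something like smallerMainImageUrl(image_name, 400), where image_name is the value from the snippet above and 400 is an arbitrary minimum width in pixels.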

There are a bunch of edge cases to take care of, but hopefully you get the gist...

Thanks.