Wikimedia - Githubissues

tbo47 commented 7 years ago

Wikimedia has an RSS/Atom feed featuring one of the best pictures in their catalog every day. It would be nice to use it as a feed of wallpapers! https://commons.wikimedia.org/wiki/Commons:Picture_of_the_day

ifl0w commented 7 years ago

This would be possible quite easily, but I just suggested a Generic JSON API source in #22, where you could also add your opinion if you want to. Since the Wikimedia picture feed would be an XML response, we would have to think of another solution, but it would also be possible to add a Generic XML API source.

lens0021 commented 1 year ago

Wikimedia Commons also has its own API.

What I've found

* Picture of the day is picked up from [MediaWiki:Ffeed-potd-transcludeme](https://commons.wikimedia.org/wiki/MediaWiki:Ffeed-potd-transcludeme)
* It transcludes `Template:Potd/YYYY-MM-DD`, for example [Template:Potd/2023-04-10](https://commons.wikimedia.org/wiki/Template:Potd/2023-04-10)
* A wikitext [`{{potd}}`](https://commons.wikimedia.org/wiki/Template:Potd) can be used for fetching potd. The wikitext behind is `[[File:{{Potd/{{#time:Y-m-d||en}}}}]]`
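
The per-day template title mentioned here can be derived from a date. A minimal sketch, assuming the `YYYY-MM-DD` suffix follows UTC (which is how `{{#time:}}` behaves by default); `potdTemplateTitle` is a hypothetical helper name, not part of any API:

```javascript
// Hypothetical helper: build the title of the per-day POTD template.
// Assumption: the date suffix is in UTC.
function potdTemplateTitle(date = new Date()) {
    // toISOString() always returns UTC, e.g. "2023-04-10T00:00:00.000Z".
    const day = date.toISOString().slice(0, 10);
    return `Template:Potd/${day}`;
}

console.log(potdTemplateTitle(new Date(Date.UTC(2023, 3, 10))));
// → Template:Potd/2023-04-10
```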

I tried the following sources, but both failed:

Prefix: https://commons.wikimedia.org/wiki/Special:FilePath/
- $.parse.images[0] from https://commons.wikimedia.org/w/api.php?action=parse&format=json&page=MediaWiki%3AFfeed-potd-transcludeme&prop=images&formatversion=2 (See API Sandbox also)
- $.parse.images[0] from https://commons.wikimedia.org/w/api.php?action=parse&format=json&text=%5B%5BFile%3A%7B%7BPotd%2F%7B%7B%23time%3AY-m-d%7C%7Cen%7D%7D%7D%7D%5D%5D&prop=images&contentmodel=wikitext&formatversion=2

It seems `$.parse.images[0]`, `$.parse.images`, and `$.parse` are all null, while `$` is an object. Any ideas?

Lucki commented 1 year ago

The following works:

Request URL:

https://commons.wikimedia.org/w/api.php?action=parse&format=json&page=MediaWiki%3AFfeed-potd-transcludeme&prop=images&formatversion=2

URL prefix:

https://commons.wikimedia.org/wiki/Special:Redirect/file?wptype=file&wpvalue=

JSON Path:
```
$.parse.images[0]
```

References

`https://upload.wikimedia.org/wikipedia/commons/a/a0/Geraldine_Ulmar_in_Gilbert_and_Sullivan%27s_The_Mikado.jpg`

~~~ json
{
  "parse": {
    "title": "API",
    "pageid": 0,
    "images": [
      "Geraldine_Ulmar_in_Gilbert_and_Sullivan's_The_Mikado.jpg"
    ]
  }
}
~~~
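
To illustrate how the three settings fit together, here is a rough sketch that applies the JSON path to the sample response and appends the result to the URL prefix. `resolvePath` is a hypothetical stand-in that only handles simple paths, and escaping the file name with `encodeURIComponent()` is an assumption, not necessarily what the extension does:

```javascript
// Sample response, taken from the reference in this comment.
const response = {
    parse: {
        title: 'API',
        pageid: 0,
        images: ["Geraldine_Ulmar_in_Gilbert_and_Sullivan's_The_Mikado.jpg"],
    },
};

// Hypothetical resolver for simple paths such as "$.parse.images[0]".
// It only understands dotted keys and numeric indexes, not full JSONPath.
function resolvePath(root, path) {
    const tokens = path.match(/[^.$\[\]]+/g) || [];
    return tokens.reduce((node, key) => (node == null ? undefined : node[key]), root);
}

const prefix = 'https://commons.wikimedia.org/wiki/Special:Redirect/file?wptype=file&wpvalue=';
const fileName = resolvePath(response, '$.parse.images[0]');

// Assumption: the value is escaped before it is appended to the prefix.
const imageUrl = prefix + encodeURIComponent(fileName);
console.log(imageUrl);
```

A path that does not exist in the response resolves to `undefined` instead of throwing, which makes the failure case easy to detect.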

Edit:

I missed the date part, will update the request URL above later.
I missed that https://github.com/ifl0w/RandomWallpaperGnome3/commit/58c32707ac118344b8b01e41c7b044dbe3e35e01 wasn't backported to the soup file, so in the current version the above solution will fail because of a missing user agent.

Lucki commented 1 year ago

This is actually a bit more involved.

The reason that pasting the normal URL doesn't work is that we're encoding the URI. An already encoded URI will be encoded a second time: %5B%5BFile%3A → %255B%255BFile%253A
But if you decode the URI first and use that, it won't work either, because we're using encodeURI(), which only gives us a partially encoded URI.

Characters that may be part of the URI syntax (`; / ? : @ & = + $ , #`) are left untouched by encodeURI() and are only escaped by encodeURIComponent().

[[File: → %5B%5BFile:

https://github.com/ifl0w/RandomWallpaperGnome3/blob/02c8bfd549a87f0ecf29af917c636fd273a3b635/randomwallpaper%40iflow.space/sourceAdapter.js#L350
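
Both effects can be reproduced with the standard functions alone. A small sketch (the variable names are just for illustration):

```javascript
const alreadyEncoded = '%5B%5BFile%3A';
const raw = '[[File:';

// encodeURI() escapes "%" itself, so existing escapes get encoded twice.
console.log(encodeURI(alreadyEncoded)); // → %255B%255BFile%253A

// encodeURI() leaves reserved characters such as ":" untouched.
console.log(encodeURI(raw)); // → %5B%5BFile:

// encodeURIComponent() escapes the reserved characters as well.
console.log(encodeURIComponent(raw)); // → %5B%5BFile%3A
```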

Possible fixes:

- ~~Use encodeURIComponent() instead. This may cause problems for URLs that don't want these encoded.~~
- Ditch the encoding in the JSON adapter altogether and let the user put in the correct URL.

@ifl0w Any opinions?

ifl0w commented 1 year ago

@Lucki I'd not remove the encoding entirely since I'd assume that this could break existing setups. If the encodeURIComponent() solution fundamentally works, we could try that.

Another somewhat hacky approach could be to simply call decodeURI() before calling encodeURI(). This should have no effect on already decoded URIs, and I guess it would work around this issue?

Lucki commented 1 year ago

I'm sorry, we can't use encodeURIComponent() because that'll also encode https:// and path separators.

URL c&p from Firefox:                   https://commons.wikimedia.org/w/api.php?action=parse&format=json&text=%5B%5BFile%3A%7B%7BPotd%2F%7B%7B%23time%3AY-m-d%7C%7Cen%7D%7D%7D%7D%5D%5D&prop=images&contentmodel=wikitext&formatversion=2
decodeURI:                              https://commons.wikimedia.org/w/api.php?action=parse&format=json&text=[[File%3A{{Potd%2F{{%23time%3AY-m-d||en}}}}]]&prop=images&contentmodel=wikitext&formatversion=2
decodeURIComponent:                     https://commons.wikimedia.org/w/api.php?action=parse&format=json&text=[[File:{{Potd/{{#time:Y-m-d||en}}}}]]&prop=images&contentmodel=wikitext&formatversion=2
encodeURI(decodeURI):                   https://commons.wikimedia.org/w/api.php?action=parse&format=json&text=%5B%5BFile%253A%7B%7BPotd%252F%7B%7B%2523time%253AY-m-d%7C%7Cen%7D%7D%7D%7D%5D%5D&prop=images&contentmodel=wikitext&formatversion=2
encodeURI(decodeURIComponent):          https://commons.wikimedia.org/w/api.php?action=parse&format=json&text=%5B%5BFile:%7B%7BPotd/%7B%7B#time:Y-m-d%7C%7Cen%7D%7D%7D%7D%5D%5D&prop=images&contentmodel=wikitext&formatversion=2
encodeURIComponent(decodeURI):          https%3A%2F%2Fcommons.wikimedia.org%2Fw%2Fapi.php%3Faction%3Dparse%26format%3Djson%26text%3D%5B%5BFile%253A%7B%7BPotd%252F%7B%7B%2523time%253AY-m-d%7C%7Cen%7D%7D%7D%7D%5D%5D%26prop%3Dimages%26contentmodel%3Dwikitext%26formatversion%3D2
encodeURIComponent(decodeURIComponent): https%3A%2F%2Fcommons.wikimedia.org%2Fw%2Fapi.php%3Faction%3Dparse%26format%3Djson%26text%3D%5B%5BFile%3A%7B%7BPotd%2F%7B%7B%23time%3AY-m-d%7C%7Cen%7D%7D%7D%7D%5D%5D%26prop%3Dimages%26contentmodel%3Dwikitext%26formatversion%3D2

No matter how we encode and decode, the result never matches the correct URL copied from Firefox.
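
One combination not covered by these attempts is to keep encodeURI() but undo the double encoding of escapes that were already present. This is only a sketch of that idea, not the extension's actual code, and `encodePreservingEscapes` is a hypothetical name:

```javascript
// Hypothetical helper: encode a URL with encodeURI(), then restore any
// "%XX" escapes that were already present in the input.
function encodePreservingEscapes(url) {
    // encodeURI() turns an existing "%5B" into "%255B"; map it back to "%5B".
    return encodeURI(url).replace(/%25([0-9A-Fa-f]{2})/g, '%$1');
}

const fromFirefox = 'https://commons.wikimedia.org/w/api.php?action=parse&format=json&text=%5B%5BFile%3A%7B%7BPotd%2F%7B%7B%23time%3AY-m-d%7C%7Cen%7D%7D%7D%7D%5D%5D&prop=images&contentmodel=wikitext&formatversion=2';

// An already encoded URL passes through unchanged.
console.log(encodePreservingEscapes(fromFirefox) === fromFirefox); // → true

// Unescaped input is still encoded the way encodeURI() would encode it.
console.log(encodePreservingEscapes('[[File:')); // → %5B%5BFile:
```

The trade-off is that a literal `%` followed by two hex digits in unencoded input is treated as an existing escape, and unencoded input containing a raw `#` still ends up with a fragment.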

lens0021 commented 1 year ago

I do not understand the problem exactly, but the API call https://commons.wikimedia.org/w/api.php?action=parse&format=json&pageid=18176574&prop=images&formatversion=2 is also available (18176574 is the page id of MediaWiki:Ffeed-potd-transcludeme). Does this solve something?
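
For reference, the same request URL can be assembled from named parameters. A minimal sketch using the standard `URLSearchParams` class; the page id is the one quoted in this comment:

```javascript
// Build the query string from named parameters instead of hand-writing it.
const params = new URLSearchParams({
    action: 'parse',
    format: 'json',
    pageid: '18176574', // page id quoted in the comment above
    prop: 'images',
    formatversion: '2',
});

const requestUrl = `https://commons.wikimedia.org/w/api.php?${params}`;
console.log(requestUrl);
// → https://commons.wikimedia.org/w/api.php?action=parse&format=json&pageid=18176574&prop=images&formatversion=2
```

This URL contains no characters that `encodeURI()` would change, so running it through the encoder leaves it intact.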

Lucki commented 1 year ago

For this particular case that's a workaround, yes.

The problem remains for other sites where such a workaround isn't as easy to find. A general solution would be better. Also, this prevents pasting a URL from your browser address bar, which is quite an inconvenience.

tbo47 commented 1 year ago

Here is an example of how to find the latest picture in plain JS:

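// Download the current Wikimedia Commons picture of the day (Node.js).
const https = require('https');
const fs = require('fs');
const crypto = require('crypto');

// Thumbnail width, e.g. 1024px, 1280px or 2560px.
const resolution = process.argv[2] || '1280px';

const feedUrl = 'https://commons.wikimedia.org/w/api.php?action=featuredfeed&feed=potd&feedformat=atom';

// Markers used to pick lines out of the feed and the linked page.
// They depend on the current markup and may need adjusting.
const match1 = 'href="https://commons.wikimedia.org/wiki/Special';
const match2 = 'typeof="mw:File"';

// Wikimedia expects a User-Agent header; this value is only a placeholder.
const options = { headers: { 'User-Agent': 'potd-example/0.1 (placeholder contact)' } };

// Fetch a URL and resolve with the response body as a string.
const getUrl = (url) => new Promise((resolve, reject) => {
    https.get(url, options, (resp) => {
        let data = '';
        resp.setEncoding('utf8');
        resp.on('data', (c) => { data += c; });
        resp.on('end', () => resolve(data));
    }).on('error', reject);
});

// Stream a URL to a local file.
const download = (url, dest) => new Promise((resolve, reject) => {
    https.get(url, options, (resp) => {
        if (resp.statusCode !== 200) {
            resp.resume();
            reject(new Error(`Unexpected status ${resp.statusCode} for ${url}`));
            return;
        }
        const file = fs.createWriteStream(dest);
        resp.pipe(file);
        file.on('finish', () => file.close(resolve));
        file.on('error', reject);
    }).on('error', reject);
});

(async () => {
    const feed = await getUrl(feedUrl);
    // The last matching link in the feed belongs to the newest entry.
    const urls = feed.split(/\n/).filter((l) => l.includes(match1)).map((l) => l.split(/"/)[5]);
    const lastUrl = urls[urls.length - 1];
    const page = await getUrl(lastUrl);
    const hrefs = page.split(/\n/).filter((l) => l.includes(match2)).map((l) => l.split(/"/)[7]);
    if (!hrefs.length) throw new Error('Could not find a file link on the page');
    // Strip the leading "/wiki/File:" (11 characters) to get the URL-encoded file name.
    const picName = hrefs[0].slice(11);
    // The directory is not fixed: it is derived from the MD5 hex digest of the decoded file name.
    const hash = crypto.createHash('md5').update(decodeURIComponent(picName)).digest('hex');
    const picFullUrl = `https://upload.wikimedia.org/wikipedia/commons/thumb/${hash[0]}/${hash.slice(0, 2)}/${picName}/${resolution}-${picName}`;
    await download(picFullUrl, '/tmp/wikimedia.jpg');
})().catch((err) => {
    console.error(err);
    process.exit(1);
});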

HarrySidhuz0008 commented 1 year ago


ifl0w / RandomWallpaperGnome3

Wikimedia #21