Open matkoniecz opened 3 years ago
ah, thanks Mateusz, good catch. Will take a look this week. cheers
oh, looks like the openstreetmap api is using a custom path - this should do the trick:
wtf.fetch('https://wiki.openstreetmap.org/wiki/Tag:highway%3Dmotorway', { path: '/w/api.php' }).then((doc) => {
console.log(doc.templates('ValueDescription'))
})
we're not recognizing the openstreetmap infoboxes, but we should - do you wanna work together on a openstreetmap plugin? That would be amazing.
yeah, and same for the 'Character' infobox template on the muppet wiki. Knowing the difference between a regular template and an infobox is really handy, but we're not doing a great job at it, clearly, for 3rd-party wikis especially. Open to any ideas.
we're not recognizing the openstreetmap infoboxes, but we should - do you wanna work together on a openstreetmap plugin? That would be amazing.
What kind of info would be needed? Names of templates used as infoboxes or something else?
Open to any ideas.
I guess that one may try to detect inbox (first template with multiple parameters), but such heuristic sounds like something that will cause more work than will save. And as demonstrated by OSM Wiki not all infoboxes have "Infobox" in its name.
Translations pages on OSM Wiki are even more evil, with infobox loaded via black magic from local Wikidata database.
oh, looks like the openstreetmap api is using a custom path - this should do the trick:
THANKS! I was really confused about what is happening.
yeah - i just went for a walk and was thinking the same thing - a heuristic about an early template, with a bunch of properties, and an 'image' or a 'name' one? That's a great idea.
I can add support for KeyDescription
, ValueDescription
, Place
, Tag
, RelatedTermList
- are there any other heavily-used ones?
and an 'image' or a 'name' one
Yeah, that should help.
I can add support for KeyDescription, ValueDescription, Place, Tag, RelatedTermList - are there any other heavily-used ones?
KeyDescription, ValueDescription Place - yes, infoboxes
Strictly speaking RelatedTermList
is not an infobox, just list of related search terms in a weird formar to help searching work properly.
Template:Tag is going to be used inline - see for example infoboxless https://wiki.openstreetmap.org/wiki/Church - and has its twin Template:Key
While I think that solution here is to add a osmwiki plugin. I also think that the fetch function deserves an refactor as well.
I see a few problems with the code there:
Firstly the function signature is a mess and all the parameters can mean all kinds of stuff. Since we a heading to a major/braking release this would be the time to solidify the parameters.
Secondly, error handling can be better. rightnow the error gets given back to the callback but with the prommis is gets caught and logged and it returns success. and there is no way to see into the request error.
Thirdly, there is no way to add random headers directly. If you need to add a header to the list you first need to make a pr to the lib.
and lastly I think that the way we make the http client cross platform is a bit of a hack. Rightnow we use rollup to switch out the server file for the client file. but if you would only read the code then you would assume that this lib was node only.
I would advocate checking the existence of window.fetch
in code and the switch of that. or even better jet use an external lib like axios for better compatibility. although i also see the drawback of that like that the size of the lib would trippel and adding the first dependency.
Hey wouter, ya I agree the fetch method deserves a refactor and i welcome the help. This is a good time. I spent a long time looking at client-server symmetric fetch libraries and I couldn't get things working properly. Yeah- as you mentioned, bundling either http lib would be bad. Im fine with swapping them out in rollup, but maybe we can reduce the dark-magic of it somehow. It would be great if the http lib followed http redirects. Is that something we could do without much overhead? Cheers
in addition to allowing users to set custom headers, per @Vacilando , we should set this as a default header - https://www.mediawiki.org/wiki/API:Etiquette#Request_limit recommends to set Accept-Encoding: gzip
I was already looking at this for a bit but christmas got in the way. I already wrote out my usual wall of text and will post it tomorrow.
As I already said I'm busy with the fetch function. but as always I have a few questions and things I want to check.
When i'm looking at usages of the wtf.fetch
on github. I see some old usage of callbacks but mostly promise based usage. I also see more use of language than options.
So my proposal is similar to the currently existing signature
/**
* fetches the page from the wiki and returns a Promise with the parsed wikitext
*
* @param {string | number | Array<number | string>} title the title, PageID, URL or an array of all three of the page(s) you want to fetch
* @param {string} [language] the language of the wiki you want to use
* @param {fetchDefaults} [options] the options for the fetch or the language of the wiki for the article
* @param {Function} [callback] the callback function for the call
* @returns {Promise<null | Document | Document[]>} either null if the pages is not found, Document if you asked for one result, and a array of Documents if you asked for multiple pages
*/
const fetch = function (title, language, options, callback) {
And with this signature there is no overlap in the parameters. And it it also preserves some of the compatibility.
I was looking at the parseUrl Function and i was wondering what the //eslint-disable-line
was covering up and it was eslint-compat.
It was trying to warn us that the URL class was only added as a global in node 10 and so we can not claim that we support node 7-9.
Easiest solution is bumping node to 10. The most difficult path would be to write our own URL parser. Middle path being importing a URL lib / polyfill.
I'm in favor of bumping to 10. I can't find any stats on node versions usage. But 10 is nearly end of life so I hope nobody will mind.
Part of the reason to refactor this part is to clear up the usage of http libraries. So I sought a library that is the same for node and browser and optimised for size.
I settled on (https://www.npmjs.com/package/isomorphic-unfetch)[isomorphic-unfetch] which is a wrapper around the build in fetch function of browsers or a polyfill for older browsers and node-fetch for node. All of these use the same api so we don't have to know on what platform we are.
Also all options for the fetch are available so also redirects.
I also build the project and the size does not change much.
name | old | new | diff |
---|---|---|---|
wtf_wikipedia.js | 268 kb | 271 kb | +3 kb |
wtf_wikipedia.mjs | 252 kb | 254 kb | +1 kb |
wtf_wikipedia-client.js | 267 kb | 270 kb | +3 kb |
wtf_wikipedia-client.js.map | 524 kb | 524 kb | +0 kb |
wtf_wikipedia-client.min.js | 124 kb | 117 kb | -7 kb |
wtf_wikipedia-client.mjs | 124 kb | 117 kb | -7 kb |
The options parameter is another part of this refactor I’d like to fix. Rightnow if there is a small problem, for example a missing header, we need to build a new version with that header in the options->header part. With the introduction of fetch I could see that people want to change more options.
So what i propose is moving all the fetch options to a ‘fetchOptions’ key and putting that right in to fetch (with some defaults). And all the other options are for us to configure our code.
This way we can also type all the properties so the auto complete of editors will find some errors.
On an unrelated note, V9 adds a lot of changes everywhere. I’d like to see a code review before you release V9 and I would be down to do it.
These were my ideas for fetch.
In my research I implemented a large part of the chagens already. But, as always, your feedback is appreciated. I will also clean the other code in the _fetch folder.
YESSSSSSSSSSSSSSSSSSSS you are the best. Thank you. plz pr to the dev branch (if that's possible)
i scaled-back some tslint stuff since your last pr - there were so many red lines. I couldn't figure out how to fix them. I could switch that back on if you'd like. I was just confused
I was working on its ticket last year. 🤣 and i did com far. i have integrated the fetch lib and changed the build script a bit so it uses the right builds for each of our builds. but no i have decision to make.
Im working on the input of the fetch function. given that the wikipedia api does not let you request pages by title and page id in the same request. how do we handle this.
and I saw another ticket about bundling requests in array do we also what to integrate this now that i'm working on this
hey wouter, thanks for asking. Yeah, I'd like to move the fancy stuff like that to the api plugin. I'm okay with returning null, or whatever response wikimedia gives us for that. we can even scale the array stuff down in this release if you'd prefer. no rush on this stuff, i've gotten distracted by other things too. I haven't touched any of the http stuff either. thanks!
I run into
SyntaxError: Unexpected token < in JSON at position 0
It seems that it may be caused by the same issue as #384 - where http error page was parsed as json.
I want to check is it actually happening but right now I have no idea how to get response code of a failing fetch.