spencermountain / wtf_wikipedia

a pretty-committed wikipedia markup parser
https://observablehq.com/@spencermountain/wtf_wikipedia
MIT License
779 stars 129 forks

Fetching an array fails if there's a full URL #416

Open Vacilando opened 3 years ago

Vacilando commented 3 years ago

Consider wtf.fetch(wikiArticles)

For wikiArticles = 'https://en.wikipedia.org/wiki/Cosmology' the Cosmology page is fetched.

For

wikiArticles = [
  'Richard P. Feynman',
  'Thor Heyerdahl',
  'https://en.wikipedia.org/wiki/Cosmology',
]

only the Richard Feynman and Thor Heyerdahl pages get fetched.

For

wikiArticles = [
  'https://en.wikipedia.org/wiki/Cosmology',
  'Richard P. Feynman',
  'Thor Heyerdahl',
]

nothing gets fetched.

spencermountain commented 3 years ago

hey @Vacilando yeah, sorry about that. That's pretty gross. I'm working on a wtf-api-plugin on the dev branch that may help with your work. It will rate-limit large arrays, and should improve this sort of workflow. May be a few weeks out still. What are you working on? cheers

Vacilando commented 3 years ago

I'll use titles instead of URLs for the time being. But it's a serious bug indeed; it will help us and certainly many others when this gets fixed. Thanks a million for looking into that!
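
A minimal sketch of that workaround, for anyone hitting the same thing: convert any full article URLs to bare titles before handing the array to wtf.fetch. The toTitle helper below is hypothetical and not part of wtf_wikipedia. It assumes URLs of the form https://<lang>.wikipedia.org/wiki/<Title>, and it discards the language/domain part of the URL, so it only suits arrays where every page lives on the same wiki.

```javascript
// Hypothetical helper (not part of wtf_wikipedia):
// turn a full article URL into a bare page title.
function toTitle(input) {
  // Anything that is not an http(s) URL is assumed to be a title already.
  if (!/^https?:\/\//i.test(input)) {
    return input
  }
  const { pathname } = new URL(input)
  // Assumption: article paths look like /wiki/<Title>.
  const match = pathname.match(/^\/wiki\/(.+)$/)
  if (!match) {
    return input
  }
  // Decode percent-escapes and swap underscores for spaces.
  return decodeURIComponent(match[1]).replace(/_/g, ' ')
}

const wikiArticles = [
  'https://en.wikipedia.org/wiki/Cosmology',
  'Richard P. Feynman',
  'Thor Heyerdahl',
]

// Every entry is now a plain title, regardless of its position in the array.
const titles = wikiArticles.map(toTitle)
```

The resulting titles array can then be passed to wtf.fetch(titles) in place of the mixed array.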