chrisbra / wikipedia2text

A commandline tool for querying the Wikipedia

More accurate summary by parsing source code #6

Closed elig0n closed 6 years ago

elig0n commented 6 years ago

This is an alternative way to grab the Wikipedia summary that is more accurate and uses curl, grep, and sed.

Explanation: It fetches the HTML source of a Wikipedia article with curl and stores it in a temporary file, $TMPFILE. It then greps the lines that make up the summary, uses sed to drop a trailing line that is not needed (this should probably be replaced with a less cumbersome command), formats the result as usual, and passes it to ${BROWSER} -dump. Finally, it removes the temporary file.
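To make the steps above concrete, here is a minimal sketch of such a pipeline. It is not the command proposed in this issue: the function names, the `^<p>` grep pattern, the English Wikipedia URL layout, and the lynx fallback for ${BROWSER} are all assumptions chosen for illustration.

```shell
#!/bin/sh
# Hypothetical sketch. The grep pattern, the URL layout, and the lynx
# fallback are assumptions, not the command from the issue.

# Keep only lines that open a paragraph, then drop the last matched line.
extract_summary() {
    grep '^<p>' "$1" | sed '$d'
}

# Fetch an article, extract the summary, render it as text, and clean up.
wiki_summary() {
    TMPFILE=$(mktemp) || return 1
    # -f: fail on HTTP errors, -sS: quiet but show errors, -L: follow redirects
    curl -fsSL "https://en.wikipedia.org/wiki/$1" -o "$TMPFILE"
    # Give the extracted fragment an .html name so the browser treats it as HTML.
    extract_summary "$TMPFILE" > "$TMPFILE.html"
    "${BROWSER:-lynx}" -dump "$TMPFILE.html"
    rm -f "$TMPFILE" "$TMPFILE.html"
}
```

With these definitions loaded, `wiki_summary Unix` would print a text rendering of the matched paragraphs, provided the page source actually starts its summary paragraphs with `<p>` at the beginning of a line.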

Possible TODO: Add a test for the existence of curl. Allow wget as an alternative. Split the last 'rm' part of the command so the previous if structure could be used, i.e., add the colorize string after the basic command and then append the 'rm' part in each ...
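The first two TODO items could be handled roughly like this. This is only a sketch: `pick_downloader` and `fetch` are hypothetical names, and nothing here is taken from the wikipedia2text script itself.

```shell
#!/bin/sh
# Hypothetical helper names; not part of wikipedia2text.

# Print a download command that writes to stdout: curl if present, else wget.
pick_downloader() {
    if command -v curl >/dev/null 2>&1; then
        echo "curl -fsSL"
    elif command -v wget >/dev/null 2>&1; then
        echo "wget -q -O -"
    else
        echo "error: neither curl nor wget found" >&2
        return 1
    fi
}

# Usage: fetch URL > "$TMPFILE"
fetch() {
    downloader=$(pick_downloader) || return 1
    # Left unquoted on purpose so the command and its options are split into words.
    $downloader "$1"
}
```

Both branches write the page to standard output, so the rest of the pipeline would not need to know which tool was picked.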

chrisbra commented 6 years ago

Sorry for taking so long. I had too much other stuff to do. I like that approach, so let's just use that one instead. I'll leave the TODO open for now. Perhaps somebody wants to have a look at it :)