ericchiang / pup

Parsing HTML at the command line
MIT License

Feature idea: Add "--exec" to allow for looping through results #78

Open MaffooBristol opened 7 years ago

MaffooBristol commented 7 years ago

I wanted to use pup to mass download some files from a site, and it did a very good job of parsing the HTML. But then I had to pipe the results into a bash loop – it would be nice to have this built in with an --exec type flag, in a similar vein to how the find command works. It could have a different name; that's just my initial suggestion.

So, instead of this implementation:

curl http://www.mysite.com | pup 'a attr{href}' | while read -r i; do wget "$i"; done

You could have:

curl http://www.mysite.com | pup 'a attr{href}' --exec wget {} \;

Does that sound like a valid idea? Thanks!

ericchiang commented 7 years ago

It sounds like xargs is what you want: https://linux.die.net/man/1/xargs
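For example, the while-loop pipeline from the issue could be rewritten with xargs (a sketch using the same placeholder site, www.mysite.com, from the original post):

```shell
# Same pipeline as the bash while-loop version, using xargs instead:
# xargs -n 1 invokes wget once per href that pup extracts.
curl http://www.mysite.com | pup 'a attr{href}' | xargs -n 1 wget
```

With GNU or BSD xargs, adding -P N runs up to N wget processes concurrently, which covers much of what an --exec flag would offer.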

sbmkvp commented 1 year ago

parallel is another feature-rich alternative for achieving this:

curl http://www.mysite.com | pup 'a attr{href}' | parallel "wget {}"