ericchiang / pup

Parsing HTML at the command line
MIT License

Feature Request: strip tags that match a selector #74

Open hughrawlinson opened 7 years ago

hughrawlinson commented 7 years ago

Similar to grep's -v (inverse grep), I'd love an 'inverse select' option for pup. Something like:

$ echo '<div><h1>Yes</h1><p>No</p></div>' | pup -v 'p'
<div><h1>Yes</h1></div>

ericchiang commented 7 years ago

Yea this would be a welcome addition.

nick-bull commented 5 years ago

Progress on this? Best I could come up with in the meantime:

# Needs bash (process substitution) and GNU diff (--unchanged-group-format).
html="<h1>hi</h1><h2>there</h2>"
diff --unchanged-group-format="" \
  <(echo "$html" | pup --indent 0 'h2') \
  <(echo "$html" | pup --indent 0 '*') | pup

--indent 0 is redundant in this example, but with multiline HTML, differing indentation would keep otherwise identical lines from matching, so I guess you could say... it makes a diff (sorry). The final pup pipe re-indents the HTML for ya.
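
For anyone wondering what the diff step is doing, here is a minimal sketch of it in isolation. It assumes GNU diff, and the two files are hand-written stand-ins for the `pup 'h2'` and `pup '*'` outputs (not real pup output; the file names are made up for illustration). With an empty `--unchanged-group-format`, lines common to both inputs are dropped and only lines unique to one side are printed.

```shell
# Hypothetical stand-ins for the two pup outputs, one tag or text node per line.
tmp=$(mktemp -d)
printf '%s\n' '<h2>' 'there' '</h2>' > "$tmp/selected"
printf '%s\n' '<h1>' 'hi' '</h1>' '<h2>' 'there' '</h2>' > "$tmp/everything"

# An empty unchanged-group format suppresses lines shared by both files.
# diff exits with status 1 when the inputs differ, hence the `|| true`.
remaining=$(diff --unchanged-group-format="" "$tmp/selected" "$tmp/everything" || true)

printf '%s\n' "$remaining"
rm -rf "$tmp"
```

Because the comparison is line by line, this only behaves when both inputs are formatted identically, which is the reason for passing the same `--indent` value to both pup calls.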

wolfgang42 commented 5 years ago

@nick-bull #81 implements this. I see that Eric has just started making commits again; hopefully he'll merge the open PRs soon and make a new release. In the meantime, pulling and building my branch should get you what you need.

frioux commented 2 years ago

I merged this into https://github.com/frioux/pup, fyi