Open ghost opened 11 years ago
Hey @thangalin, I'm unsure of the business case for this feature. What sort of situation would you need it in?
Maybe a more flexible solution would be for me to make it easier for you to include Element Finder as a Node module inside another JavaScript file. Your custom file could then define its own output format.
Mostly for screen scraping. I've since found the W3C tools that allow me to accomplish this task: http://www.w3.org/Tools/HTML-XML-utils/
For example:
wget -q -O - http://website.com/ | hxnormalize -l 240 -x 2>/dev/null | hxselect -s '\n' -c "label.black" | sort | uniq > content.txt
Contrasted with:
wget -q -O - http://website.com/ | elfinder -s "label.black" | sort | uniq > content.txt
I could then easily import the elements into a database. But this probably isn't what you intended for the tool. Plus, a solution already exists, and I can easily wrap the hxnormalize and hxselect tools in a shell script to get:
wget -q -O - http://website.com/ | cssgrep "label.black" | sort | uniq > content.txt
Your tool came close, but it's just not usable with the other Unix tools, which limits its usefulness for generic scraping and parsing of web pages within a shell (e.g., bash).
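For anyone landing here later, the wrapper could look roughly like the sketch below. It assumes hxnormalize and hxselect from HTML-XML-utils are on the PATH and reuses the flags from the pipeline above; `cssgrep` is only the hypothetical name used in this comment, not an existing tool.

```shell
# cssgrep: hypothetical wrapper name taken from this thread.
# Reads HTML on standard input and prints the content of each element
# matching the CSS selector given as $1, one match per line.
# Assumes hxnormalize and hxselect (W3C HTML-XML-utils) are installed.
cssgrep() {
  if [ "$#" -ne 1 ]; then
    # Usage errors go to standard error so they never pollute the pipe.
    echo "usage: cssgrep SELECTOR < input.html" >&2
    return 2
  fi
  # Normalize to well-formed XML first, then select by CSS selector.
  hxnormalize -l 240 -x 2>/dev/null | hxselect -s '\n' -c "$1"
}
```

With that function defined, the one-liner becomes `wget -q -O - http://website.com/ | cssgrep "label.black" | sort | uniq > content.txt`.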
Hey Dave,
I quite like the idea of making the output of Element Finder easier to pipe into other command line tools.
I will do some research into the standard practices for input/output of Unix tools and think about how this would best apply to Element Finder. I think it would make sense for Element Finder to have a similar interface to grep, because they are both searching through files for matches to a pattern: grep with a regular expression and Element Finder with a CSS selector.
I actually haven’t considered using Element Finder for scraping before. It's primarily designed for use during web development when you want to check which, if any, of your files contain a match for a CSS selector. But I can see scraping is a natural extension of that.
Cheers
Hi, Keegan.
If you read from standard input and write to standard output, the tool can be piped with all other Unix tools. Any errors (or logging) should be written to standard error.
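To make the convention concrete, here is a minimal sketch of a grep-like filter written as a shell function. `match_lines` is a made-up name for illustration only and has nothing to do with Element Finder's actual code; the exit statuses mirror grep's (0 for a match, 1 for none, 2 for an error).

```shell
# match_lines: hypothetical filter that only illustrates the stream
# conventions; it is not part of Element Finder.
# Input comes from standard input, results go to standard output,
# diagnostics go to standard error.
# Exit status follows grep: 0 = match found, 1 = no match, 2 = usage error.
match_lines() {
  if [ "$#" -ne 1 ]; then
    echo "usage: match_lines SUBSTRING < input" >&2   # diagnostic: stderr
    return 2
  fi
  found=1
  # The "|| [ -n ... ]" part also handles a last line without a newline.
  while IFS= read -r line || [ -n "$line" ]; do
    case "$line" in
      *"$1"*)
        printf '%s\n' "$line"                         # result: stdout
        found=0
        ;;
    esac
  done
  return "$found"
}
```

Because only results reach standard output, something like `printf 'alpha\nbeta\n' | match_lines alpha | sort` composes cleanly, and a caller can still silence or log the diagnostics separately with `2>/dev/null` or `2>errors.log`.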
👍 on the idea of piping to other commands.
For example, vi `elfinder -s .some-class`
would be reaaaaaall handy :)
Overview
Would be great to use Element Finder to select the content of an HTML element based on its CSS path.
Example
For example, consider the following HTML document:
I'd like to execute the following
This would write the following to standard output: