codelibs / elasticsearch-river-web

Web Crawler for Elasticsearch
Apache License 2.0

Index objects on page instead of entire page #127

Open dutchiexl opened 7 years ago

dutchiexl commented 7 years ago

If I understand correctly, a page is currently indexed as a single Elasticsearch document, and the configured properties are stored as fields of that document.

What I want to achieve is this: scan a page for the div with class 'team', then index every child element with the class 'member' as its own Elasticsearch document.

Is that possible?
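
To make the request concrete, here is a minimal sketch of the splitting step, written in plain Python with only the standard library. It does not use river-web's API or configuration, and nothing in this thread says whether river-web supports this. The names `MemberExtractor` and `split_members` and the `text` field are hypothetical, chosen only for illustration. The idea is one output dict per direct child with class 'member' inside the div with class 'team'.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class MemberExtractor(HTMLParser):
    """Collects one document per direct 'member' child of a div.team (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.depth = 0            # current nesting depth
        self.team_depth = None    # depth of the open div.team, if any
        self.member_depth = None  # depth of the open member element, if any
        self.chunks = []          # text pieces of the member being read
        self.documents = []       # one dict per member

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        self.depth += 1
        classes = (dict(attrs).get("class") or "").split()
        if self.team_depth is None and tag == "div" and "team" in classes:
            self.team_depth = self.depth
        elif (self.team_depth is not None and self.member_depth is None
              and self.depth == self.team_depth + 1 and "member" in classes):
            # Direct child of div.team with class 'member': start a new document.
            self.member_depth = self.depth
            self.chunks = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.member_depth == self.depth:
            # Member element closed: normalize whitespace and emit the document.
            text = " ".join(" ".join(self.chunks).split())
            self.documents.append({"text": text})
            self.member_depth = None
        elif self.team_depth == self.depth:
            self.team_depth = None
        self.depth -= 1

    def handle_data(self, data):
        if self.member_depth is not None:
            self.chunks.append(data)


def split_members(html):
    """Return a list of dicts, one per member element found in the page."""
    parser = MemberExtractor()
    parser.feed(html)
    parser.close()
    return parser.documents
```

Each returned dict would then be indexed as a separate Elasticsearch document instead of indexing the page as a whole; the indexing call itself is left out here. The sketch assumes well-formed HTML with matching closing tags.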