dirtyfilthy / freshonions-torscraper

Fresh Onions is an open source TOR spider / hidden service onion crawler hosted at zlal32teyptf4tvi.onion
GNU Affero General Public License v3.0

Searching for a "long string" like "this site is made as a joke" returns error 500 #27

Open keldnorman opened 5 years ago

keldnorman commented 5 years ago

Searching for a long string with "match phrase" selected returns an error. It looks like Elasticsearch returns the error: "'elasticsearch_dsl.response.hit.HitMeta object' has no attribute 'highlight'"

Any tips on how to fix this problem?

127.0.0.1 - - [27/Sep/2018 22:25:57] "GET /?search=This+site+is+made+as+a+joke&submit=Go+%3E%3E%3E HTTP/1.1" 200 -

[2018-09-27 22:26:01,051] ERROR in app: Exception on / [GET]
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/flask/app.py", line 1982, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/lib/python2.7/dist-packages/flask/app.py", line 1614, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/lib/python2.7/dist-packages/flask/app.py", line 1517, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/lib/python2.7/dist-packages/flask/app.py", line 1612, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/lib/python2.7/dist-packages/flask/app.py", line 1598, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/app/torsearch/server/lib/tor_cache.py", line 60, in my_decorator
    response = f(*args, **kwargs)
  File "<auto generated wrapper of index() function>", line 2, in index
  File "/usr/local/lib/python2.7/dist-packages/pony/orm/core.py", line 460, in new_func
    try: return func(*args, **kwargs)
  File "/app/torsearch/server/web/app.py", line 154, in index
    r, n_results = helpers.render_elasticsearch(context)
  File "<auto generated wrapper of render_elasticsearch() function>", line 2, in render_elasticsearch
  File "/usr/local/lib/python2.7/dist-packages/pony/orm/core.py", line 460, in new_func
    try: return func(*args, **kwargs)
  File "/app/torsearch/server/lib/helpers.py", line 47, in render_elasticsearch
    return (render_template('index_fulltext.html', domains=domains, results=results, context=context, orig_count=orig_count, n_results=n_results, page=page, per_page=result_limit, sort=sort, is_more = is_more), orig_count)
  File "/usr/lib/python2.7/dist-packages/flask/templating.py", line 134, in render_template
    context, ctx.app)
  File "/usr/lib/python2.7/dist-packages/flask/templating.py", line 116, in _render
    rv = template.render(context)
  File "/usr/lib/python2.7/dist-packages/jinja2/environment.py", line 989, in render
    return self.environment.handle_exception(exc_info, True)
  File "/usr/lib/python2.7/dist-packages/jinja2/environment.py", line 754, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/app/torsearch/server/web/templates/index_fulltext.html", line 4, in top-level template code
    {% from 'search_panel.macro.html' import search_panel %}
  File "/app/torsearch/server/web/templates/layout.html", line 5, in top-level template code
    {% block body %}{% endblock %}
  File "/app/torsearch/server/web/templates/index_fulltext.html", line 16, in block "body"
    {{ domain_fulltext_table(domains, results, sortable=True, context=context) }}
  File "/app/torsearch/server/web/templates/domain_table.macro.html", line 161, in template
    {{break_long_words(hit.meta.highlight.body_stripped[0])|safe}}
  File "/usr/lib/python2.7/dist-packages/jinja2/environment.py", line 408, in getattr
    return getattr(obj, attribute)
UndefinedError: 'elasticsearch_dsl.response.hit.HitMeta object' has no attribute 'highlight'
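The last frames of the traceback show the template reading `hit.meta.highlight.body_stripped[0]` unconditionally, so any hit that arrives without a `highlight` attribute aborts the whole page. One possible workaround, sketched here and not taken from the project, is to look the attribute up defensively and fall back to the plain text. The helper name `first_highlight`, the `fallback_chars` parameter, and the stand-in objects are all hypothetical; the sketch has only been exercised against those stand-ins, not against a live elasticsearch_dsl response.

```python
from types import SimpleNamespace


def first_highlight(hit, field="body_stripped", fallback_chars=200):
    """Return the first highlighted fragment for `field`, or a plain excerpt.

    `hit` is assumed to expose `hit.meta` and `hit.<field>` the way the
    template in the traceback uses them. This helper is hypothetical.
    """
    # getattr with a default avoids the AttributeError seen in the traceback.
    highlight = getattr(hit.meta, "highlight", None)
    fragments = getattr(highlight, field, None) if highlight is not None else None
    if fragments:
        return fragments[0]
    # No highlight for this hit: fall back to the start of the stored text.
    text = getattr(hit, field, "") or ""
    return text[:fallback_chars]


# Stand-in objects that only mimic the attribute layout (made-up data).
with_highlight = SimpleNamespace(
    body_stripped="this site is made as a joke",
    meta=SimpleNamespace(
        highlight=SimpleNamespace(body_stripped=["made as a <em>joke</em>"])
    ),
)
without_highlight = SimpleNamespace(
    body_stripped="this site is made as a joke",
    meta=SimpleNamespace(),  # no `highlight` attribute at all
)

print(first_highlight(with_highlight))
print(first_highlight(without_highlight))
```

The same idea could probably be applied directly in the Jinja template by guarding the expression with a test such as `hit.meta.highlight is defined`, but that has not been tried against this code base.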

L3houx commented 5 years ago

Hi @keldnorman! I didn't think it was linked to the length of the search input. I thought it was linked to the missing attribute highlight. Do you have the attribute highlight in your Elasticsearch database?

If you find something, let me know!

keldnorman commented 5 years ago

I don't know how I can check whether I have the attribute highlight in my Elasticsearch database. Under Kibana -> Management -> Fields I see 24 fields:

Name | Type
_id | string
_index | string
_score | number
_source | _source
_type | string
body | string
body_stripped | string
code | number
created_at | date
domain_id | number
is_banned | boolean
is_crap | boolean
is_fake | boolean
is_frontpage | boolean
is_genuine | boolean
is_subdomain | boolean
is_up | boolean
last_alive | date
nid | number
port | number
ssl | boolean
title | string
url | string
visited_at | date

Nothing is called highlight. Am I on the right track here?
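A note that may help here: in Elasticsearch, `highlight` is not a field stored in the index, so it would not be expected in the Kibana field list. It is an option sent with the search request, and the result comes back as a `highlight` section on each hit, only for hits where a highlighted field actually matched. The sketch below builds such a request body as a plain dictionary and checks a raw response for hits that lack the section. The function names and the sample response are hypothetical, and whether this project builds its query this way is an assumption.

```python
def build_phrase_query(phrase, field="body_stripped"):
    """Build a search body that asks for a phrase match plus highlighting.

    The field name is taken from the traceback; the rest is an assumption
    about how a request could look, not the project's actual query.
    """
    return {
        "query": {"match_phrase": {field: phrase}},
        # Highlighting is requested per search; it is not an index field.
        "highlight": {"fields": {field: {}}},
    }


def hits_without_highlight(response):
    """Return the _id of every hit in a raw response that has no highlight."""
    hits = response.get("hits", {}).get("hits", [])
    return [hit.get("_id") for hit in hits if "highlight" not in hit]


# Hypothetical raw response, trimmed to the keys used above.
sample_response = {
    "hits": {
        "hits": [
            {"_id": "1", "highlight": {"body_stripped": ["made as a <em>joke</em>"]}},
            {"_id": "2"},  # matched, but no highlight section was returned
        ]
    }
}

body = build_phrase_query("this site is made as a joke")
print(body["highlight"])
print(hits_without_highlight(sample_response))
```

If a check like this reports hits without a `highlight` section, the template line from the traceback would fail for exactly those hits.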

L3houx commented 5 years ago

Hi @keldnorman, did you find a solution, or are you still looking into it? If you still have this error, take a look at our fork; the documentation is clear and detailed, step by step. https://github.com/GoSecure/freshonions-torscraper/

keldnorman commented 5 years ago

I do still have this error. Where do I look in the source to find the change you refer to that fixes the problem? Do you have a direct link?

Regards Keld


L3houx commented 5 years ago

I also have the error. I suspected that a setting in etc/elasticsearch was disabled. I don't have any idea how to solve this yet, but I will try checking it later. If you find something, let me know; you could also open a PR on our fork to add support for searching really long strings.