ellmetha / django-machina

A Django forum engine for building powerful community driven websites.
https://django-machina.readthedocs.org
BSD 3-Clause "New" or "Revised" License

Suggestion: reduce dependency - full text-search / django-haystack #178

Open rainulf opened 4 years ago

rainulf commented 4 years ago

First of all, thanks for sharing such an awesome project! It's really great.

Regarding this suggestion: it would be great if we could reduce dependencies such as django-haystack. It's another great project, but it appears to be a bit outdated, especially its ElasticsearchSearchEngine implementation. I'm also getting the following warning with SimpleEngine when running update_index (even though it says it has indexed everything):

haystack/backends/simple_backend.py:41: UserWarning: update is not implemented in this backend
  warn('update is not implemented in this backend')

... which is related to https://github.com/ellmetha/django-machina/issues/85#issuecomment-426107733, but the search then returns no results.

For simple / small forums, Django's __search or __icontains lookups are probably good enough? In any case, I will see if I can take a stab at it.
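
To illustrate the kind of query an __icontains-style search boils down to, here is a minimal sketch using only the standard-library sqlite3 module rather than Django. The `post` table, its `subject` and `content` columns, and the `search_posts` helper are hypothetical names chosen for the example and are not taken from django-machina. In a Django project the equivalent would presumably be a filter using the `__icontains` lookup on the post model.

```python
import sqlite3


def search_posts(conn, term):
    """Case-insensitive substring search over subject and content."""
    # Escape LIKE wildcards so that user input is matched literally.
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    pattern = f"%{escaped}%"
    rows = conn.execute(
        "SELECT id, subject FROM post "
        "WHERE subject LIKE ? ESCAPE '\\' OR content LIKE ? ESCAPE '\\' "
        "ORDER BY id",
        (pattern, pattern),
    )
    return rows.fetchall()


# Hypothetical sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE post (id INTEGER PRIMARY KEY, subject TEXT, content TEXT)"
)
conn.executemany(
    "INSERT INTO post (subject, content) VALUES (?, ?)",
    [
        ("Welcome", "Say hello to the community"),
        ("Search is broken", "update_index reports success but nothing is found"),
        ("100% off topic", "Unrelated chatter"),
    ],
)

# SQLite's LIKE is case-insensitive for ASCII characters by default.
print(search_posts(conn, "HELLO"))
```

A query like this scans every row, so it is only a reasonable fit for the small forums mentioned above; it needs no index to be built or kept up to date.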

ellmetha commented 4 years ago

Yup. I've had this idea in mind for a while now: django-machina would have its own search backend mechanism. But this would require a lot of work (and a major breaking change).

I'm open to suggestions regarding this!

savelmtr commented 4 years ago

I've tried to run the search on my tiny forum but failed. Whoosh hits a memory error, Solr refuses even to update the index, and Elasticsearch just refuses to work. The simple engine returns nothing. So, we have to admit that in reality django-machina doesn't have search.

I think it may be more convenient to use django.contrib.postgres for search purposes and maybe __search as @rainulf just wrote.

If you explain how search is implemented in django-machina, I will try to repair it. Right now I'm a bit confused.

ellmetha commented 4 years ago

@savelmtr while I agree that django-haystack probably became "unmaintained" and that this is a concern that should be taken into account regarding django-machina, I don't think it's fair (nor constructive) to state that "in reality django-machina doesn't have search".

If you look at the demo app, which uses the Whoosh backend, you'll notice that search works as expected.

I'm not saying that django-haystack doesn't have problems that need to be addressed, but in the current state of things django-machina relies on Haystack for its search capabilities. Changing this will take some time as you can imagine. So in the meantime if you need help with a short term solution regarding django-haystack (to make it work! 😃), the best I can do is to point you to the django-haystack project itself.

ellmetha commented 4 years ago

To be more precise about this, I plan to completely rewrite the forum_search application and introduce a search backend mechanism that would be built into django-machina. At first we could imagine releasing something with only two backends: a default one working with all database backends, and another one for PostgreSQL. In the future other backends could be added depending on the needs (Elasticsearch, etc.). 😉
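
As a rough illustration of what such a pluggable backend mechanism could look like, here is a minimal pure-Python sketch. Every name in it (`Post`, `BaseSearchBackend`, `SubstringBackend`, `AllTermsBackend`, `BACKENDS`, `get_backend`) is hypothetical and does not come from django-machina; the second backend is only a stand-in for a richer, database-specific one such as a PostgreSQL full-text backend.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Post:
    """Hypothetical stand-in for a forum post."""
    id: int
    subject: str
    content: str


class BaseSearchBackend(ABC):
    """Hypothetical interface that every search backend would implement."""

    def __init__(self, posts: Iterable[Post]):
        self.posts = list(posts)

    @abstractmethod
    def search(self, query: str) -> List[Post]:
        """Return the posts matching the query."""


class SubstringBackend(BaseSearchBackend):
    """Default backend: case-insensitive substring match on the whole query."""

    def search(self, query: str) -> List[Post]:
        needle = query.strip().casefold()
        if not needle:
            return []
        return [
            p for p in self.posts
            if needle in p.subject.casefold() or needle in p.content.casefold()
        ]


class AllTermsBackend(BaseSearchBackend):
    """Stand-in for a richer backend: every query term must appear somewhere."""

    def search(self, query: str) -> List[Post]:
        terms = query.casefold().split()
        if not terms:
            return []
        return [
            p for p in self.posts
            if all(t in f"{p.subject} {p.content}".casefold() for t in terms)
        ]


# Registry keyed by a name that a (hypothetical) setting could select.
BACKENDS = {"default": SubstringBackend, "all_terms": AllTermsBackend}


def get_backend(name: str, posts: Iterable[Post]) -> BaseSearchBackend:
    """Instantiate the backend registered under the given name."""
    try:
        return BACKENDS[name](posts)
    except KeyError:
        raise ValueError(f"Unknown search backend: {name!r}") from None


posts = [
    Post(1, "Whoosh memory error", "Indexing 200000 posts fails"),
    Post(2, "PostgreSQL search", "Full-text search without Haystack"),
]
print([p.id for p in get_backend("all_terms", posts).search("search haystack")])
```

The point of the sketch is that calling code only depends on `search()`, so a new backend can be registered without touching the views that use it.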

savelmtr commented 4 years ago

Regarding django-haystack: the fault lies neither with it nor with django-machina. The problem is that the owner of the forum can't afford a virtual machine with more than 1 GB of RAM. Whoosh can provide search for the forum, but it eats memory. The forum I've built has more than 200,000 posts, so Whoosh hits a memory error. I suppose that if the site had about 30,000 posts, Whoosh would work. (I can't say the same about Solr, because its backend in Haystack doesn't split huge posts, so even the indexing failed.) You may notice that the problems begin only when a forum has a lot of posts and very long posts.

But, as far as I can see, the Whoosh project is a bit abandoned. Also, I'm a newbie in Python and can't fix its memory leak. I don't know, maybe I should add a new backend to Haystack that connects to django.contrib.postgres, just for convenience?

Your plans for search are very cool. Maybe one day they will become reality ;) If I can help with them, I'll be glad to be useful.

savelmtr commented 4 years ago

Okay, @ellmetha, I've done it! You can check PR #187. I think the description of this PR should be added to the docs in some form.

savelmtr commented 4 years ago

@ellmetha, I don't know how to write tests. If that is the reason for your silence, maybe you could help me with this?

simonjoeca commented 4 years ago

I'm trying both django-haystack's simple backend and Whoosh, but neither returns anything from search.

I ran update_index as below in both cases:

python manage.py update_index
Updating backend: default
Backend 'default' doesn't require rebuilding

Is there something I'm missing? Can someone shed some light on this, please?

simonjoeca commented 4 years ago

Figured it out. It turned out that haystack needs to come before wagtail.* in INSTALLED_APPS (at least in Django 1.9~2.2); otherwise the identically named command in Wagtail is called instead of the Haystack one.
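
For anyone hitting the same problem, the ordering described above might look like the following settings fragment. The exact Wagtail app names (`wagtail.search`, `wagtail.core`) are an assumption based on Wagtail 2.x and will differ in other versions, and `haystack_precedes_wagtail` is a hypothetical helper added only to make the ordering rule explicit.

```python
# settings.py (fragment); the app list is illustrative, not a full configuration.
INSTALLED_APPS = [
    "django.contrib.contenttypes",
    "django.contrib.auth",
    # Listed before the Wagtail apps so that Haystack's update_index
    # command takes precedence over the identically named Wagtail one.
    "haystack",
    "wagtail.search",  # assumed Wagtail 2.x app name
    "wagtail.core",    # assumed Wagtail 2.x app name
]


def haystack_precedes_wagtail(apps):
    """Return True if "haystack" is listed before every wagtail.* app."""
    haystack_at = apps.index("haystack")
    wagtail_at = [i for i, name in enumerate(apps) if name.startswith("wagtail")]
    return all(haystack_at < i for i in wagtail_at)


print(haystack_precedes_wagtail(INSTALLED_APPS))
```

When two apps ship a management command with the same name, Django uses the one from the app that appears first in INSTALLED_APPS, which is why the order matters here.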

BoPeng commented 3 years ago

I landed on this thread while looking at the whole machina/haystack issue. It is clear to me that haystack is no longer maintained and should be removed from machina.

The way to move forward, may I suggest, is to develop separate search functions for each search engine. For example, a new machina-search-elasticsearch app that uses django-elasticsearch-dsl could be added for Elasticsearch. Support for other search engines can be added in a similar fashion.

I believe that this is the easiest way to avoid the entire haystack dilemma, namely trying to come up with a "common ground" or "abstract layer" for all search engines (which would be a revamped machina-search app in the case of machina) but ending up not supporting any of them well.