Open hancush opened 6 years ago
some additional ideas:
- `add` will update the item if it exists, i.e., it may not be necessary to delete index items, provided we can ensure consistent primary keys in the db / solr index. this second bit would require a refactor. evz mentions that we have employed this technique in the past with councilmatic, but that sometimes the database and index get out of sync anyway, which suggests this works differently than we understand it and some additional homework needs to be completed.
- `update_index` can do some of this ^ heavy lifting for us. we did not use haystack for this search index for a few reasons: there's inefficiency baked into its index building command (https://github.com/datamade/devops/issues/42); it uses the orm, which would be quite slow for data of this app's magnitude (especially when uploading large chunks of data, but less of a problem for amending a smaller subset of employers); and we didn't anticipate we'd need the majority of its functionality, making it a pretty heavy dependency. we've debugged the inefficiency in its indexing operation, but points two and three are still an issue.
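a rough sketch of the "consistent primary keys" idea, for discussion only. nothing here is the app's actual code: `doc_id`, `to_doc`, the `employer.` prefix, the sample rows, and the `FakeIndex` class (a dict standing in for the solr index) are all hypothetical. it assumes the index's unique key field is `id`, so adding a document whose `id` already exists replaces the old copy rather than creating a duplicate.

```python
def doc_id(kind, pk):
    """Build a stable index id from the record type and its database primary key."""
    return f"{kind}.{pk}"


def to_doc(kind, row):
    """Turn a database row (a plain dict here) into an index document."""
    doc = dict(row)
    doc["id"] = doc_id(kind, row["id"])
    return doc


class FakeIndex:
    """Hypothetical stand-in for the solr index: documents are keyed on `id`."""

    def __init__(self):
        self.docs = {}

    def add(self, docs):
        # Same id: the existing document is replaced, never duplicated.
        for doc in docs:
            self.docs[doc["id"]] = doc


index = FakeIndex()
# Made-up rows: the second upload amends the same employer (same database pk).
index.add([to_doc("employer", {"id": 7, "name": "Example Employer"})])
index.add([to_doc("employer", {"id": 7, "name": "Example Employer (amended)"})])
```

because the id is derived from the database primary key, the second `add` overwrites the first and the index still holds one document for that employer. this only holds if the primary keys stay stable between uploads, which is the part that would need the refactor.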
right now, we rebuild the whole index every time new data is uploaded. this isn't really necessary: we only need to reindex an employer and its employees when new data is uploaded for that employer.
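a minimal sketch of what reindexing only the affected employers might look like. the function names, field names, id prefixes, and sample rows are all hypothetical, and `RecordingClient` is a stand-in so the example runs without solr. the only assumption about the real client is that it exposes an `add(docs, commit=...)` method, which is the shape of pysolr's `Solr.add`.

```python
def docs_for_employers(employers, employees, changed_ids):
    """Collect index documents for the changed employers and their employees only."""
    changed = set(changed_ids)
    docs = [
        {"id": f"employer.{e['id']}", "name": e["name"]}
        for e in employers
        if e["id"] in changed
    ]
    docs += [
        {"id": f"person.{p['id']}", "name": p["name"], "employer_id": p["employer_id"]}
        for p in employees
        if p["employer_id"] in changed
    ]
    return docs


def reindex_employers(client, employers, employees, changed_ids):
    """Send only the affected documents to the index; return how many were sent."""
    docs = docs_for_employers(employers, employees, changed_ids)
    if docs:
        client.add(docs, commit=True)
    return len(docs)


class RecordingClient:
    """Hypothetical stand-in for a solr client: records what it was asked to add."""

    def __init__(self):
        self.added = []

    def add(self, docs, commit=False):
        self.added.extend(docs)


# Made-up data: new rows were uploaded for employer 1 only.
employers = [{"id": 1, "name": "Employer A"}, {"id": 2, "name": "Employer B"}]
employees = [
    {"id": 10, "name": "Person X", "employer_id": 1},
    {"id": 11, "name": "Person Y", "employer_id": 2},
]
client = RecordingClient()
sent = reindex_employers(client, employers, employees, changed_ids=[1])
```

here only employer 1 and its one employee are sent; employer 2's documents are left alone. note that this sketch never removes anything, so an employee dropped from a later upload would linger in the index unless it is deleted separately.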