Open TeddyCr opened 5 years ago
I'd also consider adding a redundant tags array (array of strings)... It may be better if we use the Elasticsearch indices as the source for analysis and content updates rather than PostgreSQL.
Tags should be a keyword datatype, and title, department, recommendations, and body should be text... However, title, recommendations, and body should also have index and index_phrases set to true.
A mapping might look like:
{
  "mappings": {
    "properties": {
      "date": {
        "type": "date"
      },
      "title": {
        "type": "text",
        "index": true,
        "index_phrases": true
      },
      "recommendations": {
        "type": "text",
        "index": true,
        "index_phrases": true
      },
      "body": {
        "type": "text",
        "index": true,
        "index_phrases": true
      },
      "department": {
        "type": "text"
      },
      "sponsors": {
        "type": "text"
      },
      "tags": {
        "type": "keyword"
      }
    }
  }
}
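For anyone wiring this up from Python, here is a minimal sketch that builds the same mapping as a dictionary so it can be serialized and sent as the body of an index-creation request. The index name `agenda_items` and the helper names are assumptions for illustration; nothing in this thread fixes them.

```python
import json

# Hypothetical index name; the thread does not specify one.
INDEX_NAME = "agenda_items"


def phrase_text_field():
    """Text field that also indexes two-term shingles (index_phrases)."""
    return {"type": "text", "index": True, "index_phrases": True}


def build_mapping():
    """Return the mapping proposed in this thread as a plain dict."""
    return {
        "mappings": {
            "properties": {
                "date": {"type": "date"},
                "title": phrase_text_field(),
                "recommendations": phrase_text_field(),
                "body": phrase_text_field(),
                "department": {"type": "text"},
                "sponsors": {"type": "text"},
                "tags": {"type": "keyword"},
            }
        }
    }


if __name__ == "__main__":
    # The printed JSON could be used as the body of `PUT /agenda_items`.
    print(json.dumps(build_mapping(), indent=2))
```

Building the mapping in code keeps the three phrase-indexed fields consistent, so a typo in one key cannot slip into only one of them.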
Values for tags can be inserted as an array, and Elasticsearch will index each element as a separate keyword.
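As a rough illustration, a document with a tags array could be assembled like this before indexing. The normalization step is my own assumption, not something decided in this thread: keyword fields are not analyzed, so "Housing" and "housing" would otherwise be stored as two distinct tags. The sample values are made up.

```python
def normalize_tags(raw_tags):
    """Lowercase, strip, and de-duplicate tags, keeping first-seen order."""
    seen = set()
    tags = []
    for tag in raw_tags:
        cleaned = tag.strip().lower()
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            tags.append(cleaned)
    return tags


def build_document(title, body, raw_tags):
    """Build a document whose `tags` value is a plain list of strings."""
    return {"title": title, "body": body, "tags": normalize_tags(raw_tags)}


# Made-up sample data for illustration only.
doc = build_document(
    "Bike lane expansion",
    "Staff report on proposed bike lanes.",
    ["Transportation", " housing ", "transportation"],
)
```

No array type needs to be declared in the mapping; a field mapped as `keyword` accepts either a single string or a list of strings.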
What do you think about making department a keyword type as well?
"It may be better if we use the Elasticsearch indices as the source for analysis and content updates rather than PostgreSQL."
Do you mean fetching agenda items on the backend directly from Elasticsearch as opposed to Postgres?
I'm not sure what I mean about the analysis yet, because we don't do any for the tagging yet, but when we do, we'd like to pull from a richer source like Elasticsearch.
department is an indexed text field, so it probably doesn't need the reference architecture of a keyword... but you can do either.
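If "either" turns out to be too limiting, one option is a multi-field: keep department as text for full-text search and add a keyword sub-field for exact filtering and aggregations. This is only a sketch of that option, not something agreed in the thread; the sub-field name `keyword` and the helper names are assumptions.

```python
def department_multi_field():
    """Mapping for department as text with an un-analyzed keyword sub-field."""
    return {
        "type": "text",
        "fields": {"keyword": {"type": "keyword"}},
    }


def department_filter(name):
    """Exact-match query against the keyword sub-field."""
    return {"term": {"department.keyword": name}}


def department_search(text):
    """Analyzed full-text query against the text field."""
    return {"match": {"department": text}}
```

With this layout, `department_search("public works")` matches on analyzed tokens, while `department_filter("Public Works")` only matches the exact stored string.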
Actually, I was rethinking the organization of the mappings (if we choose to do something like what Bonnie was suggesting with user identification of tags and some predefined users aliasing and tailoring the tags).
The agenda item ingestion flow should be added to the engage-scrapper library as a separate module (esutils.py), as opposed to the Celery task.
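To make the idea concrete, here is a hedged sketch of what such a module might contain: it turns scraped agenda items into a newline-delimited body for the Elasticsearch bulk API. The function names, the `agenda_items` index name, and the shape of the input dicts (including an optional `id` key) are all assumptions; the real module may look quite different.

```python
import json

# Hypothetical index name; not specified anywhere in this issue.
INDEX_NAME = "agenda_items"

# Field names taken from the mapping proposed in this thread.
MAPPED_FIELDS = (
    "date", "title", "recommendations", "body",
    "department", "sponsors", "tags",
)


def to_es_document(item):
    """Keep only the fields that appear in the proposed mapping."""
    return {name: item[name] for name in MAPPED_FIELDS if name in item}


def build_bulk_payload(items, index_name=INDEX_NAME):
    """Return an NDJSON string: one action line plus one source line per item."""
    lines = []
    for item in items:
        action = {"index": {"_index": index_name}}
        if "id" in item:
            # Reuse the item's id (assumed key) so re-ingestion overwrites.
            action["index"]["_id"] = str(item["id"])
        lines.append(json.dumps(action))
        lines.append(json.dumps(to_es_document(item)))
    # The bulk API expects the body to end with a newline.
    return ("\n".join(lines) + "\n") if lines else ""
```

Keeping the payload builder free of network calls means it can be unit-tested in the scraper library without a running cluster; the actual HTTP request can then live wherever the ingestion is triggered.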
Context
Engage is developing search functionality to improve the user experience. We'll use Elasticsearch to let users narrow down agenda items to a specific topic.
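As a sketch of what "narrowing down to a topic" could mean in query terms, the helper below builds a search body that phrase-matches the text fields and optionally filters on tags. The field names come from the mapping proposed in the comments; the function name and defaults are assumptions, and the final query design is still open.

```python
def build_topic_query(phrase, tags=None, size=10):
    """Build a search body: phrase match on text fields, optional tag filter."""
    query = {
        "bool": {
            "should": [
                {"match_phrase": {"title": phrase}},
                {"match_phrase": {"body": phrase}},
                {"match_phrase": {"recommendations": phrase}},
            ],
            # Require at least one of the phrase clauses to match.
            "minimum_should_match": 1,
        }
    }
    if tags:
        # Exact matching on the keyword field; any listed tag qualifies.
        query["bool"]["filter"] = [{"terms": {"tags": list(tags)}}]
    return {"size": size, "query": query}
```

For example, `build_topic_query("affordable housing", tags=["housing"])` would return only items tagged `housing` that contain the exact phrase in the title, body, or recommendations.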
Links
Dependencies
To-Dos
*A backend endpoint should be added to the backend.engage.town/api/.... part of the site. The structure of the JSON data passed to the Elasticsearch API should be as follows:
New to the Project?
Check out our product documentation repo.