Closed ScarlettZ98 closed 4 years ago
We want to support searching with the # and @ signs on TwitterMap. Doing this in AsterixDB would require much more work, so we decided to implement it in the Elasticsearch version, which has a more powerful full-text search engine. Elasticsearch allows us to customize the analyzers. A custom analyzer includes three components:
Functionalities:
Character filter: A character filter receives the original text as a stream of characters and can transform the stream by adding, removing, or changing characters.
Tokenizer: A tokenizer receives a stream of characters and breaks it up into individual tokens.
Token filters: Token filters modify the tokens produced by the tokenizer, e.g. the lowercase filter.
Initially we tried the whitespace tokenizer, which simply splits the text on whitespace. That approach is too simple and loses useful functionality of the standard tokenizer. The final solution keeps the standard tokenizer but adds a character filter that replaces “#” and “@” with the placeholder words “hashtagsign” and “usermentionsign” before tokenization. This preserves the functionality of the standard tokenizer and allows users to search for accounts and hashtags. It also does not affect the original text shown on the screen.
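To make the three stages concrete, here is a small Python sketch that imitates the pipeline outside of Elasticsearch. It is only a toy model: the regex tokenizer is a crude stand-in for the standard tokenizer, the stop-word set is a short illustrative subset, and the function names are made up for this example.

```python
import re

# Mirrors the mappings used by the character filter in this issue.
CHAR_MAPPINGS = {"#": "hashtagsign", "@": "usermentionsign"}

# Illustrative subset only, not Elasticsearch's full English stop-word list.
STOP_WORDS = {"a", "an", "and", "the", "is", "in", "of", "to"}


def char_filter(text):
    """Character filter stage: rewrite characters before tokenization."""
    for src, dst in CHAR_MAPPINGS.items():
        text = text.replace(src, dst)
    return text


def tokenize(text):
    """Crude approximation of the standard tokenizer: keep runs of word characters."""
    return re.findall(r"\w+", text)


def token_filters(tokens):
    """Token filter stage: lowercase every token, then drop stop words."""
    lowered = [t.lower() for t in tokens]
    return [t for t in lowered if t not in STOP_WORDS]


def analyze(text, use_char_filter=True):
    """Run the stages in order: char filter -> tokenizer -> token filters."""
    if use_char_filter:
        text = char_filter(text)
    return token_filters(tokenize(text))


with_filter = analyze("The #Election2020 and @NASA")
without_filter = analyze("The #Election2020 and @NASA", use_char_filter=False)
print(with_filter)     # ['hashtagsignelection2020', 'usermentionsignnasa']
print(without_filter)  # ['election2020', 'nasa']
```

Without the character filter, the signs are discarded and a hashtag becomes indistinguishable from the plain word. With it, the sign survives as a prefix inside the token, so a query for "#Election2020" only matches tweets that actually contain the hashtag.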
Code:
curl -X PUT "localhost:9200/twitter.ds_tweet" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "index": {
      "max_result_window": 2147483647
    },
    "analysis": {
      "analyzer": {
        "default": {
          "type": "custom",
          "char_filter": ["space_hashtags"],
          "tokenizer": "standard",
          "filter": ["lowercase", "stop"]
        }
      },
      "char_filter": {
        "space_hashtags": {
          "type": "mapping",
          "mappings": ["#=>hashtagsign", "@=>usermentionsign"]
        }
      }
    }
  }
}
'
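One way to check the analyzer after creating the index is the `_analyze` API, which returns the tokens produced for a given text. The Python sketch below builds such a request against the same host and index as the curl command. The helper names are hypothetical, and `fetch_tokens` needs a running Elasticsearch instance, so it is defined here but not called.

```python
import json
from urllib import request

ES_URL = "http://localhost:9200"  # host taken from the curl command
INDEX = "twitter.ds_tweet"        # index taken from the curl command


def build_analyze_request(text):
    """Return (url, body) for an _analyze call on the index.

    No analyzer is named in the body, so the index's default analyzer is used.
    """
    url = f"{ES_URL}/{INDEX}/_analyze"
    body = {"text": text}
    return url, body


def fetch_tokens(text):
    """Send the request and return the token strings (requires a live cluster)."""
    url, body = build_analyze_request(text)
    req = request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return [t["token"] for t in json.load(resp)["tokens"]]


url, body = build_analyze_request("#covid19 @WHO")
print(url)
print(json.dumps(body))
```

If the settings were applied, calling `fetch_tokens("#covid19 @WHO")` should return tokens that start with `hashtagsign` and `usermentionsign`.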
Reference: https://www.elastic.co/blog/found-text-analysis-part-1
Not needed anymore.
Overview
Extend the current search bar with geo-tag search. When the user enters a keyword together with "incity:cityName", the frontend will search for and show only tweets that contain the keyword and are located in the specified city.
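As a rough illustration of the intended input handling, the Python sketch below splits the search-bar text into keywords and an optional city. It is not the project's code (the frontend is JavaScript and the backend is Scala). The function name and the returned structure are assumptions, and it only handles single-word city names.

```python
import re

# Assumed syntax: a single "incity:<name>" token anywhere in the input.
INCITY_PATTERN = re.compile(r"\bincity:(\S+)", re.IGNORECASE)


def parse_search_input(raw):
    """Split raw search-bar text into keywords and an optional city filter."""
    match = INCITY_PATTERN.search(raw)
    city = match.group(1) if match else None
    # Remove the tag so that it is not treated as a keyword.
    remaining = INCITY_PATTERN.sub(" ", raw)
    return {"keywords": remaining.split(), "city": city}


print(parse_search_input("covid incity:Irvine"))
# {'keywords': ['covid'], 'city': 'Irvine'}
print(parse_search_input("covid vaccine"))
# {'keywords': ['covid', 'vaccine'], 'city': None}
```

City names containing spaces would need a quoting rule or a different separator, which is left open here.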
Plan
[x] Look for related existing files
[ ] Analyze JSONParser.scala
[ ] Analyze QueryResolver.scala
[ ] Analyze query_util.js
[ ] Modify the frontend in controllers.js
[ ] Modify backend files JSONParser.scala
Future plan
Implement more tags and syntax for the search bar, e.g. boolean operations on keywords.
Reference
Scala tutorial