AhmedZahran02/Search_Engine_Bolt
https://ahmedzahran02.github.io/Search_Engine_Bolt/
Indexer
#2
Open
ahmedoshelmy
opened
1 year ago
ahmedoshelmy
commented
1 year ago
[x] TF (Term Frequency)
[x] DF (Document Frequency)
[x] IDF (Inverse Document Frequency = log(# of documents / DF))
[x] Stemming
[x] Stop Words (the, an, and, ...)
[x] Cleaning Page Body (Using Regex)
[x] Tokenization (Splitting sentences to words)
[ ] Adding Stop Words
[ ] Multithreading
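
For reference, here is a rough sketch of how the checked items above could fit together. It is not the repo's actual indexer: the names `STOP_WORDS`, `tokenize`, and `build_index` are made up for illustration, the stop-word list is deliberately tiny, and stemming is left out because it would need an external stemmer. IDF is computed as log(N / DF), where N is the number of documents.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately small stop-word list (illustration only).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is"}


def tokenize(body: str) -> list[str]:
    """Clean a page body with regexes, split it into words, drop stop words."""
    text = re.sub(r"<[^>]+>", " ", body)             # strip HTML-like tags
    tokens = re.findall(r"[a-z0-9]+", text.lower())  # lowercase alphanumeric runs
    return [t for t in tokens if t not in STOP_WORDS]


def build_index(docs: dict[str, str]):
    """Return (tf, df, idf) for a mapping of document id -> raw page body."""
    # TF: raw count of each term inside each document.
    tf = {doc_id: Counter(tokenize(body)) for doc_id, body in docs.items()}

    # DF: number of documents that contain each term at least once.
    df = Counter()
    for counts in tf.values():
        df.update(counts.keys())

    # IDF: log(N / DF); a term present in every document gets 0.
    n = len(docs)
    idf = {term: math.log(n / d) for term, d in df.items()}
    return tf, df, idf


if __name__ == "__main__":
    pages = {
        "a": "<p>The cat and the dog</p>",
        "b": "The cat sat",
    }
    tf, df, idf = build_index(pages)
    print(tf["a"], df["cat"], idf["dog"])
```

Since each document's TF counter is built independently, the per-document step looks like the natural place to split work across threads once the multithreading item is picked up; the DF merge would then need to happen after all workers finish.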