Closed arni077 closed 5 years ago
How does this algorithm infer the contents of a website from its hash? If someone searches for "sport", how does the code know that the website contains the word "sport"?
Check out Apache Tika. Nothing is inferred from the hash itself: the file is downloaded and processed with Tika, which extracts its contents, and those contents are then indexed by Elasticsearch.
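To make that concrete, here is a rough sketch of the idea in plain Python. It is not this project's actual code and it does not call Tika or Elasticsearch: `extract_text` is a hypothetical stand-in for Tika's text extraction, and `build_index` is a toy inverted index standing in for what Elasticsearch does. The sample pages and the use of SHA-256 as the hash are assumptions for illustration only. The point is that the hash is just the key; the searchable words come from the downloaded content.

```python
import hashlib
import re
from collections import defaultdict
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Toy stand-in for Tika: collects the text nodes of an HTML page."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def extract_text(raw_html):
    """Hypothetical extraction step: raw HTML in, plain text out."""
    parser = TextExtractor()
    parser.feed(raw_html)
    parser.close()
    return " ".join(parser.parts)


def build_index(files):
    """Map each word to the set of hashes whose content contains it.

    `files` is a dict of hash -> downloaded raw content.
    """
    index = defaultdict(set)
    for content_hash, raw in files.items():
        text = extract_text(raw).lower()
        for word in re.findall(r"[a-z0-9]+", text):
            index[word].add(content_hash)
    return index


def search(index, term):
    """Return the sorted hashes of all files containing `term`."""
    return sorted(index.get(term.lower(), set()))


# Made-up sample pages, keyed by a hash of their content.
pages = [
    "<html><body><h1>Sport news</h1><p>Football results</p></body></html>",
    "<html><body><p>Cooking recipes</p></body></html>",
]
files = {hashlib.sha256(p.encode()).hexdigest(): p for p in pages}
index = build_index(files)
```

With this, `search(index, "sport")` returns only the hash of the first page, because the word "sport" was found in its extracted text, not because anything was read out of the hash.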