Open Aazhar opened 8 years ago
The indexer returns data in buckets, so we might not get all the co-authors. The same applies to keywords and interests.
The impact depends somewhat on the task: we can get the top n co-authors, but indeed not all of them (even with the latest Elasticsearch aggregations). The question is therefore whether we need to return all of the co-authors or only the top n. I think the latter makes more sense for an analytics task.
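For illustration, a top-n request could look roughly like the sketch below, which builds the body of a `terms` aggregation. The field name `authors.id`, the aggregation name `top_coauthors`, and the helper `top_coauthors_query` are assumptions for this example, not names taken from our index mapping.

```python
def top_coauthors_query(author_id, n=10):
    """Build a search body that asks for the n most frequent co-authors.

    Assumes (hypothetically) that each publication document stores its
    author identifiers in a keyword field called "authors.id".
    """
    return {
        # We only want the aggregation buckets, not the matching documents.
        "size": 0,
        # Restrict to publications written by the author of interest.
        "query": {"term": {"authors.id": author_id}},
        "aggs": {
            "top_coauthors": {
                "terms": {
                    "field": "authors.id",
                    # Number of buckets to return: this is the "top n" cut-off.
                    "size": n,
                    # Leave the author themselves out of the co-author buckets.
                    "exclude": [author_id],
                }
            }
        },
    }


body = top_coauthors_query("author-123", n=5)
```

The `size` inside `terms` is what limits the result to n buckets, which is why the full co-author list is not returned unless n is set high enough.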
Co-authors should be extracted before indexing, with duplicates handled using common string distance measures.
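A minimal sketch of what that pre-indexing step might look like, assuming co-author names arrive as plain strings. The similarity measure here is the ratio from Python's standard `difflib.SequenceMatcher`, swapped in as one possible "common distance"; the helper names and the 0.9 threshold are hypothetical and would need tuning on real data.

```python
from difflib import SequenceMatcher


def normalize(name):
    # Lowercase, drop dots and commas, and collapse repeated whitespace.
    cleaned = name.lower().replace(".", " ").replace(",", " ")
    return " ".join(cleaned.split())


def similar(a, b, threshold=0.9):
    # Similarity ratio in [0, 1]; 1.0 means the normalized names are identical.
    ratio = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
    return ratio >= threshold


def dedupe_coauthors(names, threshold=0.9):
    # Keep the first spelling seen for each group of near-identical names.
    kept = []
    for name in names:
        if not any(similar(name, k, threshold) for k in kept):
            kept.append(name)
    return kept


coauthors = dedupe_coauthors(["Jane Doe", "jane  doe.", "John Smith"])
```

This compares every name against every kept name, so it is quadratic in the number of co-authors per document. That should be acceptable for a single publication's author list, but a blocking step would be needed for corpus-wide deduplication.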