Closed TSPereira closed 4 years ago
I made the necessary changes in my local files.
Under starspace.cpp the nearestNeighbor function now looks like:
string StarSpace::nearestNeighbor(const string& line, int k) {
auto vec = getDocVector(line, " ");
auto preds = model_->findLHSLike(vec, k);
string output_preds = "";
for (auto n: preds) {
output_preds += dict_->getSymbol(n.first) + ' ' + to_string(n.second) + "\n";
// cout << dict_->getSymbol(n.first) << ' ' << n.second << endl;
}
return output_preds;
}
and then also changed the type of the function in the starspace.h header to be a string
std::string nearestNeighbor(const std::string& line, int k);
Probably there are neater ways to write that output (never programmed in C++ before), but this is good enough for me as I do the subsequent parsing in python.
Can you please share the updated python bindings? My need is to save nearest neighbors or classification copied in to a variable
If you change the files I mentioned above accordingly and rebuild starspace and python wrapper, then there aren't any changes to the bindings.
The output of these changes will be a string of format:
"<label_tag><name of label>+' '+<score>+'\n'+..."
And you can just make
x = model.nearestNeighbors(args)
I then parse the string obtained with split and pandas to get it into an organized dataframe. Example:
res = model.nearestNeighbor(query, n)
res = pd.DataFrame(res.split('\n')).iloc[:, 0].str.split(' ', expand=True)
This has become obsolete with the addition of predictTags in the python binders. Closed
Hi,
Is it possible to return the nearest neighbours to a variable instead of only printing them? This would by very useful since one might want to do something with these neighbours (my situation) and getting this info from the sys.stdout (using Python wrappers, but the issue is in starspace.cpp file) is kinda of messy.
Thanks