Open pombredanne opened 8 years ago
@pombredanne Sorry, I missed your comment. Could you please elaborate a bit? Why would you need to store IDs for your strings? The operation supported so far by this implementation is the computation of the score of relevance of a keyphrase to the AST built for a set of strings. So this relevance score is computed with respect to the whole set of strings (texts); you cannot compute the relevance of your keyphrase to some particular string in the set unless you build an AST for it separately.
Please note that I am going to rework this code a bit in the near future and will probably add some new functionality as well. So you are welcome to make any feature requests if you have them!
@mikhaildubov I was mostly interested in your generalized suffix tree construction, to evaluate it for multiple pattern search, and not so much in the scoring for now. The ID of a string can be seen as the unique terminator added to each string: https://github.com/mikhaildubov/AST-text-analysis/blob/2b8eff7e430f32fd87408401012eb315b767a2ba/east/asts/utils.py#L25
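To make the idea of per-string IDs concrete, here is a minimal sketch of what I have in mind. It is not the `east` implementation and does not use its API: it is a naive, uncompressed generalized suffix trie (quadratic in the length of each string), and the names `Node`, `build_generalized_suffix_trie` and `find_string_ids` are hypothetical, made up for illustration. Instead of a unique terminator per string, each node simply records the IDs of the strings whose suffixes pass through it, which gives the same information for lookup purposes.

```python
class Node:
    """Trie node: child links plus the IDs of all strings passing through it."""
    __slots__ = ("children", "string_ids")

    def __init__(self):
        self.children = {}       # char -> Node
        self.string_ids = set()  # IDs of strings containing the path to this node


def build_generalized_suffix_trie(strings):
    """Insert every suffix of every string, tagging nodes with the string's index.

    Naive construction for illustration only; the string ID here is just the
    position of the string in the input list (an assumption of this sketch).
    """
    root = Node()
    for string_id, text in enumerate(strings):
        for start in range(len(text)):
            node = root
            for ch in text[start:]:
                node = node.children.setdefault(ch, Node())
                node.string_ids.add(string_id)
    return root


def find_string_ids(root, pattern):
    """Return the IDs of all strings that contain `pattern` as a substring."""
    node = root
    for ch in pattern:
        node = node.children.get(ch)
        if node is None:
            return set()
    return set(node.string_ids)


strings = ["banana", "bandana", "cabana"]
trie = build_generalized_suffix_trie(strings)

# Multiple pattern search: one walk from the root per pattern.
matches = {p: find_string_ids(trie, p) for p in ["band", "cab", "ana", "xyz"]}
```

With something like this, `matches["band"]` tells me the pattern occurs only in string 1, which is the kind of answer I would like to get back from the tree. A real suffix tree would compress the edges and build in linear time, but the ID bookkeeping would be the same in spirit.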
@mikhaildubov I am just starting to play with this code. How would I be able to store some ID for a string in the AST?