bukosabino closed this issue 5 years ago
Hi @bukosabino, if you want to see the average LCP, you should run the check procedure (-c):
./egsa dataset/input-100.txt 5 -c
Also, it is not computing the Longest Common Substring (LCS); the value reported is the average of the values in the Longest Common Prefix (LCP) array.
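To illustrate the difference, here is a minimal sketch of what an average LCP value is. It builds a suffix array and LCP array naively for a single short string. This is not how egsa works internally; the function names `suffix_array` and `lcp_array` are hypothetical and exist only for this example.

```python
def suffix_array(text):
    # Naive construction: sort suffix start positions by the suffix text.
    return sorted(range(len(text)), key=lambda i: text[i:])


def lcp_array(text, sa):
    # lcp[i] is the length of the longest common prefix of the suffixes
    # starting at sa[i - 1] and sa[i]; lcp[0] is 0 by convention.
    lcp = [0] * len(sa)
    for i in range(1, len(sa)):
        a, b = sa[i - 1], sa[i]
        n = 0
        while a + n < len(text) and b + n < len(text) and text[a + n] == text[b + n]:
            n += 1
        lcp[i] = n
    return lcp


text = "banana"
sa = suffix_array(text)           # [5, 3, 1, 0, 4, 2]
lcp = lcp_array(text, sa)         # [0, 1, 3, 0, 0, 2]
average_lcp = sum(lcp) / len(lcp)  # 1.0
print(sa, lcp, average_lcp)
```

The average summarizes how much adjacent sorted suffixes overlap. It is a statistic over the whole array, so it says nothing directly about the longest substring shared by several input strings.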
So, do you think this library solves a different problem from the one I need to solve?
How is this library related to this paper: https://link.springer.com/article/10.1007/s00453-009-9369-1?
Best,
Yes, I think so. It only computes the data structures used by this paper to compute LCSs. If you want to compare strings (using another distance measure), I have implemented this tool: https://github.com/felipelouza/bwsd Best!
Cool library, but I don't need a Burrows-Wheeler measure at the moment.
I need to calculate the k-LCS in a big string collection. I use this library: https://github.com/ptrus/suffix-trees. But I have some performance problems because I have a lot of strings and they are large :(
This is why I read this paper https://link.springer.com/article/10.1007/s00453-009-9369-1 and found your repo. What do you recommend?
Good job sharing the code, Felipe!
I see.
Here you can find the implementation for the paper you have mentioned: https://www.uni-ulm.de/in/theo/research/seqana.html Also, I know this related repository: https://github.com/giovannarosone/cLCP-mACS
Best!
Hi @felipelouza,
I have finally run the library on a Linux machine :)
I am not sure I am interpreting the normal output of this library correctly, because I get a bigger LCS size with k=50 than with k=5. What does "size" mean in the output?
k=5
k=50
My problem is to find the k-LCS in n strings (n>=k and 2<=k<=n). So the LCS value for k=5 should be >= the value for k=50.
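The expected monotonic behaviour can be checked on a toy input with a brute-force sketch. It assumes that k-LCS means the longest substring occurring in at least k of the n strings. The helper `k_lcs_length` is hypothetical, enumerates every substring, and is only practical for tiny inputs, not for a large collection.

```python
from collections import defaultdict


def k_lcs_length(strings, k):
    # Map each distinct substring to the set of string indices containing it.
    owners = defaultdict(set)
    for idx, s in enumerate(strings):
        for i in range(len(s)):
            for j in range(i + 1, len(s) + 1):
                owners[s[i:j]].add(idx)
    # Length of the longest substring shared by at least k strings (0 if none).
    return max((len(sub) for sub, ids in owners.items() if len(ids) >= k), default=0)


strings = ["abcde", "xbcdy", "zzcdz", "cq"]
lengths = [k_lcs_length(strings, k) for k in range(2, len(strings) + 1)]
print(lengths)  # [3, 2, 1]: "bcd" for k=2, "cd" for k=3, "c" for k=4
```

Under this definition, any substring shared by k strings is also shared by every smaller number of them, so the length can never increase as k grows. A "size" value that increases with k is therefore measuring something else.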