lingpy / linse

A Python library for the manipulation of linguistic sequences.
Apache License 2.0

transform functionalities #3

Closed: LinguList closed this issue 4 years ago

LinguList commented 4 years ago

transform or manipulate: functions that make another sequence out of a given sequence

And maybe some of the ngram functions, but they are also rather specific, I think.

xrotwang commented 4 years ago

Regarding ngrams, I'm not sure this is needed considering that it's rather short to implement:

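def ngrams(l):
    # Yield every contiguous sub-sequence of l, longest first.
    for i in reversed(range(len(l))):  # i + 1 is the n-gram length
        for j in range(len(l) - i):  # j is the start index
            yield l[j:j+i+1]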

> list(ngrams(list('abcdefg')))
[['a', 'b', 'c', 'd', 'e', 'f', 'g'], ['a', 'b', 'c', 'd', 'e', 'f'], ['b', 'c', 'd', 'e', 'f', 'g'], ['a', 'b', 'c', 'd', 'e'], ['b', 'c', 'd', 'e', 'f'], ['c', 'd', 'e', 'f', 'g'], ['a', 'b', 'c', 'd'], ['b', 'c', 'd', 'e'], ['c', 'd', 'e', 'f'], ['d', 'e', 'f', 'g'], ['a', 'b', 'c'], ['b', 'c', 'd'], ['c', 'd', 'e'], ['d', 'e', 'f'], ['e', 'f', 'g'], ['a', 'b'], ['b', 'c'], ['c', 'd'], ['d', 'e'], ['e', 'f'], ['f', 'g'], ['a'], ['b'], ['c'], ['d'], ['e'], ['f'], ['g']]

get_all_posngrams seems a lot more powerful. So I'd rather just not add such a function here.

LinguList commented 4 years ago

I just thought about the ngram functions. They are basically all easy to implement, including bigrams, trigrams, and the like. They are not strictly needed right now, but it would be handy to have them in one place for developing new experiments and algorithms. If needed, one could add ngram functions to a dedicated ngram module of linse, I think, since they are a distinct kind of manipulation that one recognizes as such.
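To illustrate how short the fixed-order variants are, here is a minimal sketch. The names `ngrams_of_order`, `bigrams`, and `trigrams` are hypothetical and chosen only for this example; they are not part of linse or lingpy.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def ngrams_of_order(sequence: Sequence[T], n: int) -> Iterator[Sequence[T]]:
    """Yield every contiguous window of length n (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # If the sequence is shorter than n, the range is empty and nothing is yielded.
    for start in range(len(sequence) - n + 1):
        yield sequence[start:start + n]


def bigrams(sequence: Sequence[T]) -> Iterator[Sequence[T]]:
    """Windows of length 2 (hypothetical convenience wrapper)."""
    return ngrams_of_order(sequence, 2)


def trigrams(sequence: Sequence[T]) -> Iterator[Sequence[T]]:
    """Windows of length 3 (hypothetical convenience wrapper)."""
    return ngrams_of_order(sequence, 3)
```

Called as `list(bigrams(list('abc')))`, this would give `[['a', 'b'], ['b', 'c']]`. Whether such wrappers deserve their own module is exactly the open question here.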

LinguList commented 4 years ago

So in my opinion, we can drop this for the time being and mark this closed.