jwcarr / mantel

Python implementation of the Mantel test, a significance test of the correlation between two distance matrices
MIT License

More than 10 items distance-lists #5

Closed oseias-r-junior closed 4 years ago

oseias-r-junior commented 4 years ago

Is there any alternative for the case where one has two distance lists with more than 10 items each (a 15x1 matrix, e.g.)? Right now I keep getting an error saying that X is not a valid matrix...

jwcarr commented 4 years ago

It sounds like there's something weird about the structure of your input matrix. Is it two dimensional? Maybe you need to flatten it first if it's a list of pairwise distances for six objects (= 15 distances)?
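
As a minimal sketch of the flattening step, assuming the 15 distances are stored as a 15x1 nested list (the name `column` and its values are made up for illustration and are not from the original data):

```python
# Hypothetical 15x1 "matrix": 15 rows, each holding a single distance.
column = [[float(k)] for k in range(1, 16)]

# Flatten to a plain list of 15 numbers, i.e. the pairwise
# distances for six objects (6 * 5 / 2 = 15).
flat = [value for row in column for value in row]

print(len(column), len(column[0]))  # shape of the nested input
print(len(flat))                    # length of the flattened list
```

If the data is held in a NumPy array instead, `array.ravel()` should give the same one-dimensional result.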

oseias-r-junior commented 4 years ago

Thanks jwcarr for answering me.

I'll try to explain a little bit what I've tried to do...

Let's use your own basic example:


dists1 = [0.2, 0.4, 0.3, 0.6, 0.9, 0.4, 0.2, 0.4, 0.3, 0.6, 0.9] # in my case are proteomic dist (in this example are your data typed 2x)
dists2 = [0.3, 0.3, 0.2, 0.7, 0.8, 0.3, 0.3, 0.3, 0.2, 0.7, 0.8] # in my case are transcriptomic dist (same as above)

Mantel.test(dists1, dists2, perms=10000, method='pearson', tail='upper')


ValueError Traceback (most recent call last)

in
----> 1 Mantel.test(dists1, dists2, perms=10000, method='pearson', tail='upper')

/mnt/~/MantelTest-master/Mantel.py in test(X, Y, perms, method, tail)
     49     # Check that X and Y are valid distance matrices.
     50     if spatial.distance.is_valid_dm(X) == False and spatial.distance.is_valid_y(X) == False:
---> 51         raise ValueError('X is not a valid condensed or redundant distance matrix')
     52     if spatial.distance.is_valid_dm(Y) == False and spatial.distance.is_valid_y(Y) == False:
     53         raise ValueError('Y is not a valid condensed or redundant distance matrix')

ValueError: X is not a valid condensed or redundant distance matrix

But if I try again after taking one item off each list (so that each distance list now has 10 items):

dists1 = [0.2, 0.4, 0.3, 0.6, 0.9, 0.4, 0.2, 0.4, 0.3, 0.6] # e.g.
dists2 = [0.3, 0.3, 0.2, 0.7, 0.8, 0.3, 0.3, 0.3, 0.2, 0.7] # e.g.

Mantel.test(dists1, dists2, perms=10000, method='pearson', tail='upper')

(0.8934351557020253, 0.03333333333333333, 2.183396018240638)

Just to clarify, it's my first time using the Mantel test.

jwcarr commented 4 years ago

Ah, I see... the first of your two examples is not valid because it contains 11 numbers, whereas the second example is valid because it contains 10 numbers. A list of pairwise distances can only have certain sizes. For example, let's say we know the locations of some cities and we measure the distance between each pair of cities:

As you can see, 11 pairwise distances is not possible.
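
To make the size rule concrete, here is a small sketch that lists how many pairwise distances a given number of objects produces, and checks whether a list length is one of those valid sizes. The helper names `n_pairs` and `objects_for_length` are made up for this illustration and are not part of the Mantel module:

```python
import math

def n_pairs(n_objects):
    """Number of pairwise distances among n_objects items: n * (n - 1) / 2."""
    return n_objects * (n_objects - 1) // 2

def objects_for_length(length):
    """Return the number of objects a list of `length` distances implies,
    or None if no whole number of objects produces that many distances."""
    # Solve n * (n - 1) / 2 = length for n, then verify the candidate.
    n = (1 + math.isqrt(1 + 8 * length)) // 2
    return n if n >= 2 and n_pairs(n) == length else None

for n in range(2, 8):
    print(n, "objects ->", n_pairs(n), "distances")

print(objects_for_length(10))  # a valid size
print(objects_for_length(11))  # not a valid size
print(objects_for_length(15))  # a valid size
```

Under this rule, the valid sizes run 1, 3, 6, 10, 15, 21, and so on, so a list of 11 distances falls between the sizes for five and six objects.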

Does that make sense?

oseias-r-junior commented 4 years ago

Thanks for your kind reply and patience jwcarr! Yeah, now it does make sense.

Now I need to figure out how to iterate through my data to generate the (N^2 - N) / 2 distances for N 'genes'. But that is another story...

Thank you again!

jwcarr commented 4 years ago

P.S. Tip for doing the iteration. Assuming you have five "genes", the following code will allow you to calculate the ten possible distances:

distances = []
for i in range(5):
    for j in range(i+1, 5):
        dist = calculate_distance_somehow(gene[i], gene[j])
        distances.append(dist)
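
For a version of the loop above that runs as-is, here is a sketch under some loud assumptions: each "gene" is represented by a made-up tuple of three measurements, and Euclidean distance (`math.dist`, Python 3.8+) stands in for whatever distance measure suits the real data. Neither choice comes from the original thread.

```python
from itertools import combinations
from math import dist  # Euclidean distance between two points

# Hypothetical data: five "genes", each with three made-up measurements.
genes = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (0.0, 2.0, 0.0),
    (0.0, 0.0, 2.0),
    (3.0, 4.0, 0.0),
]

# combinations(genes, 2) visits the pairs (0,1), (0,2), ..., (3,4),
# the same order as the nested loops with j starting at i + 1.
distances = [dist(a, b) for a, b in combinations(genes, 2)]

print(len(distances))  # five genes give ten pairwise distances
```

The resulting flat list has one of the valid sizes, so it can be passed as one of the two distance arguments in the same way as `dists1` and `dists2` earlier in the thread.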