kymata-atlas / kymata-core

Core Kymata codebase, including statistical analysis and plotting tools
https://kymata.org
MIT License

In the computation of causality violation, how are multiple spikes for the same function treated? #361

Open caiw opened 1 week ago

caiw commented 1 week ago

Imagine a case where we have two functions, A and B, with the relationship [image]

but where there are three distinct spikes for A, ordered by latency as follows: A1, A2, B, A3.

How do we connect the A-spikes to the B-spike in the IPPM graph?

Is it like this: (1) [image], which contributes to causality violation; or like this: (2) [image]; or this: (3) [image], both of which might not?

Of course there are other options: (4) [image], (5) [image].

And dependent on this, how should CV be calculated?

I don't think we need to discuss this in detail in the paper btw, but just want to make sure we are happy with the precise definition of CV score before we finalise the results!

caiw commented 1 week ago

In my opinion, the following factors into the choice:

neukym commented 1 week ago

^ @anirudh1666

neukym commented 1 week ago

This is great. I agree with everything you say. This situation is challenging: we have no way of knowing which of these options it is. Currently (and @anirudh1666, correct me if I am wrong) we do (1), and this will correctly give a poor CV score. I think this is fine for the current paper; it is a reasonable first approximation.

I have a large number of other thoughts on this, but maybe we should keep them for a conversation between the three of us, otherwise I'll be writing all day. :-)

anirudh1666 commented 1 week ago

I agree with your comment.

We currently place an arrow from the final parent transform to the initial child transform. This is the most pessimistic of the five options. In other words, it has a low false positive rate (if you get a good IPPM, there is a very high chance it is correct) but also a high false negative rate. In choosing how to trade off these two error rates, I think we should consider the consequences of a false positive versus a false negative. Incorrectly judging a true IPPM as false can lead to missed opportunities and papers. On the other hand, incorrectly labelling a false IPPM as true can lead to incorrect results being published. (We can talk about this in person because this is quite complicated and a bit subjective.)
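
To make the current rule concrete, here is a rough sketch in Python. It is not the kymata-core implementation: the `Spike` class, the function names, and the latencies are all hypothetical. It also assumes that an edge counts as a causality violation when the parent spike is later than the child spike, and that CV is the fraction of such edges. Neither assumption is confirmed in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spike:
    """One significant spike for a transform (hypothetical structure, not the kymata-core API)."""
    transform: str
    latency_ms: float


def pessimistic_edge(parent_spikes, child_spikes):
    """Connect the final (latest) parent spike to the initial (earliest) child spike."""
    last_parent = max(parent_spikes, key=lambda s: s.latency_ms)
    first_child = min(child_spikes, key=lambda s: s.latency_ms)
    return last_parent, first_child


def violates_causality(edge):
    """Assumed criterion: the edge points backwards in time (parent later than child)."""
    parent, child = edge
    return parent.latency_ms > child.latency_ms


def cv_score(edges):
    """Assumed definition: fraction of edges that violate causality."""
    if not edges:
        return 0.0
    return sum(violates_causality(e) for e in edges) / len(edges)


# Illustrative latencies only, ordered A1 < A2 < B < A3 as in Cai's example.
a_spikes = [Spike("A", 50.0), Spike("A", 80.0), Spike("A", 140.0)]
b_spikes = [Spike("B", 110.0)]

edge = pessimistic_edge(a_spikes, b_spikes)  # A3 -> B
score = cv_score([edge])
```

Under these assumptions the only edge drawn is A3 → B, which points backwards in time, so the A/B pair is scored as a violation even though A1 and A2 both precede B.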

For the example that Cai has provided, we could use an alternative definition for CV:

This definition complies with your axioms for CV:

I definitely agree this is something we should talk about in person given its complexity.

anirudh1666 commented 1 week ago

The current definition of CV satisfies the first and third axioms but not the second. So this is something we can talk about in the Discussion, or in a note I can add to the Future Work section.