Closed SBNoor closed 1 year ago
Thanks for reporting this issue. A possible cause of this error is that your mut object is not a pandas DataFrame. Could that be the case? If it is a DataFrame, could you tell me which version of pandas you are using?
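A quick way to answer both questions is sketched below. The helper name `describe_input` and the toy matrix are made up for illustration; only `isinstance` and `pd.__version__` are standard.

```python
import pandas as pd


def describe_input(mut):
    """Return (is_dataframe, pandas_version) for a candidate mutation matrix."""
    return isinstance(mut, pd.DataFrame), pd.__version__


# Toy stand-in for the real mutation matrix (genes as rows, samples as columns).
mut = pd.DataFrame(
    [[0, 1], [1, 0]],
    index=["GENE_A", "GENE_B"],
    columns=["S1", "S2"],
)

is_df, version = describe_input(mut)
print("DataFrame:", is_df, "| pandas version:", version)

# A bare NumPy array is not a DataFrame, so the check reports False for it.
is_df_array, _ = describe_input(mut.values)
print("DataFrame:", is_df_array)
```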
It is pandas version 1.1.5. I figured out that I am supposed to use a pandas DataFrame, and with that it works. However, when I use pairwise_discover_test() for mutual exclusivity, I get the following error:
And events is of type discover.data.DiscoverMatrix and subset is of type pandas.core.series.Series. Do I need to install another version of pandas?
The .ix attribute that is mentioned in the error message was deprecated in pandas 0.20 and removed in pandas version 1.0. So a short-term fix would be to install pandas < 1.0.
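For context, code that used .ix is normally rewritten with .loc (label or boolean selection) or .iloc (positional selection). The sketch below uses a plain DataFrame as a stand-in for the events matrix and an invented boolean `subset`. It is not the discover package's own code, only an illustration of the replacement indexers.

```python
import pandas as pd

# Hypothetical stand-in data: a small binary gene-by-sample matrix.
events = pd.DataFrame(
    [[0, 1, 0], [1, 0, 0], [1, 1, 0]],
    index=["GENE_A", "GENE_B", "GENE_C"],
    columns=["S1", "S2", "S3"],
)

# A boolean Series that selects a subset of genes, aligned on the row index.
subset = pd.Series([True, False, True], index=events.index)

# Old style (fails on pandas >= 1.0): events.ix[subset]
by_label = events.loc[subset]       # boolean/label-based selection
by_position = events.iloc[[0, 2]]   # the same rows, selected by position

print(by_label)
```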
A new release of the discover package is planned for early next week, which will contain a fix for this issue. It will also have some speed improvements. So even if you go for the short-term fix, I would recommend checking back next week.
Will it be possible to leave a comment here once you've updated the package?
Sure. I will leave this issue open until the new version is published.
Version 0.9.4 was released today. Among other things, it fixes the incompatibility with recent versions of pandas.
I see that the newer version is supposed to be faster. I have a matrix of size 22000x11000, and I've been running the script on HPC for about 3 hours. Can you give a rough estimate of how long it would normally take?
Indeed the latest release is quite a bit faster. However, yours is an extremely large data set, so this will still need a long time to finish. I am not able to say how long it would take for your data, but you may have to think in the order of days rather than hours. There are a few things I can suggest to try and speed things up:
A large part of the running time of pairwise_discover_test is actually spent estimating the false discovery rates using a discrete version of the Benjamini-Hochberg procedure. Alternatively, you can choose to use the standard Benjamini-Hochberg procedure by passing the argument fdr_method="BH" to pairwise_discover_test. This will be much faster, but the price you pay is that the estimated false discovery rates may be higher than with the discrete procedure. The p values will be identical though.
This issue has been inactive for a while now, so I am closing it. Please open a new issue if you are still experiencing problems with DISCOVER.
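For readers wondering what the standard Benjamini-Hochberg option mentioned above computes, here is a minimal textbook sketch in plain Python. It is not the DISCOVER implementation, and the function name `benjamini_hochberg` is chosen only for this example. It shows why the p values stay the same: only the adjusted values (q values) are derived from them.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(pvalues)
    # Indices of the p values, sorted from smallest to largest p value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, so every adjusted value is capped
    # by the adjusted values of all larger p values (keeps them monotone).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


pvals = [0.01, 0.04, 0.03, 0.005]
qvals = benjamini_hochberg(pvals)
print(qvals)
```

The input list is left untouched. The discrete variant used by default aims to be less conservative for discrete tests, which is why its estimated false discovery rates can be lower.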
I've created a mut matrix using a MAF file. It is a binary file as stated in the documentation for Python. However, when I run discover.DiscoverMatrix(mut), I get the following error:
Can you give me some insight as to what must be causing this type error? My dataframe is of shape 1367 rows × 3018 columns and looks like: