mhorlbeck / ScreenProcessing


Issue with calculate_mw in process_experiments.py #21

Closed a-brazel closed 1 year ago

a-brazel commented 1 year ago

Hi,

When I run the process_experiments.py script from a Mac terminal, I keep getting an error that prevents correct p-value calculation. I have tried processing my own data (both data that previously worked and new data), and I now get the error message below.

Computing gene scores
  --calculate_ave
  --calculate_mw
Traceback (most recent call last):
  File "process_experiments.py", line 555, in <module>
    processExperimentsFromConfig(args.Config_File, args.Library_File_Directory, args.plot_extension.lower())
  File "process_experiments.py", line 257, in processExperimentsFromConfig
    for (phenotype, replicate), gtable in geneTableCollapsed.groupby(level=[0,1], axis=1):
UnboundLocalError: local variable 'geneTableCollapsed' referenced before assignment

I am able to generate gene tables, but many of the mw_pvalues are greater than 1.

When I run the demo data the script freezes at

"Computing gene scores
--calculate_ave
--calculate_mw"

Any idea how to fix this?

Thanks

mhorlbeck commented 1 year ago

It looks like the new version of scipy adds a Mann-Whitney parameter for the p-value calculation method that defaults to "auto". With "auto", scipy sometimes uses the "exact" method, which is very slow and gives different results. I have now set the script to always use "asymptotic". This is still quite slow (with some random numpy arrays it was 3x slower than the Mann-Whitney in older versions), so if you can, I'd use scipy 1.6.x.
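
For anyone patching their own copy, here is a minimal sketch of pinning the calculation to "asymptotic". It is not the actual change made to process_experiments.py: the helper name mw_pvalue is hypothetical, and the two-sided alternative is an assumption. It relies on scipy.stats.mannwhitneyu accepting a method keyword in scipy 1.7 and later, and skips that keyword on older releases, which only offer the normal approximation.

```python
import inspect

import numpy as np
from scipy import stats


def mw_pvalue(x, y):
    """Two-sided Mann-Whitney p-value via the normal approximation.

    Hypothetical helper for illustration; not taken from process_experiments.py.
    """
    kwargs = {"alternative": "two-sided"}  # assumption: two-sided test
    # `method` exists only in scipy >= 1.7. Older releases have no exact
    # mode in mannwhitneyu, so there is nothing to pin on those versions.
    if "method" in inspect.signature(stats.mannwhitneyu).parameters:
        kwargs["method"] = "asymptotic"
    return stats.mannwhitneyu(x, y, **kwargs).pvalue


# Small made-up samples, used only to show the call.
rng = np.random.default_rng(0)
sample_a = rng.normal(loc=0.0, size=8)
sample_b = rng.normal(loc=1.0, size=8)
print(mw_pvalue(sample_a, sample_b))
```

Passing the keyword explicitly keeps results consistent across sample sizes, since "auto" would otherwise switch between the exact and asymptotic calculations depending on the input.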