Verbose output from benchmarking pipeline seems to say 'for 25 unique sources' despite more sources

IBMB-MFP commented 6 months ago

Hi there,

While using the benchmarking pipeline I noticed some behaviour that is a bit confusing and disconcerting. I think it may be a bug, but I would like to check.

I am using decoupler 1.6.

I have a benchmarking set of around 90 experiments covering around 45 unique TFs. Running the benchmarking pipeline on this set seems to work great, but the verbose output printed to the IPython console always says 'for 25 unique sources', even though there are more sources than that. E.g.:

Using my_network network...
Extracting inputs...
Formating net...
Removed 3 experiments without sources in net.
Running **87 experiments for 25 unique sources.**
Running methods...
23806 features of mat are empty, they will be removed.
Running mlm on mat with 87 samples and 23117 targets for 482 sources.

Or:

Using my_other_network network...
Extracting inputs...
Formating net...
Removed 23 experiments without sources in net.
Running **67 experiments for 25 unique sources.**
Running methods...
9068 features of mat are empty, they will be removed.
Running mlm on mat with 67 samples and 37855 targets for 311 sources.

Note that both examples use the same benchmarking set. With the second network the set is limited to 67 experiments rather than 87, because some sources are not present in that network, but the output still says 'for 25 unique sources'. This must be a bug in this output, right?
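
For reference, the count I would expect can be checked independently of the pipeline. The snippet below is only a minimal sketch in plain Python, not decoupler's own code: the function name `count_benchmark_sources` and the toy TF names are hypothetical, and it assumes each experiment is annotated with exactly one perturbed source.

```python
def count_benchmark_sources(experiment_sources, net_sources):
    """Count experiments and unique sources that survive filtering by the network.

    experiment_sources: one source (TF) name per experiment (assumption: a single
        perturbed source per experiment).
    net_sources: iterable of source names present in the network.
    Returns (n_kept, n_removed, n_unique).
    """
    net_sources = set(net_sources)
    # Keep only experiments whose perturbed source exists in the network.
    kept = [src for src in experiment_sources if src in net_sources]
    n_removed = len(experiment_sources) - len(kept)
    # Unique sources are counted over the kept experiments only.
    return len(kept), n_removed, len(set(kept))


# Toy data, made up for illustration only.
experiment_sources = ["TP53", "TP53", "MYC", "STAT3", "GATA1"]
net_sources = {"TP53", "MYC", "STAT3", "JUN"}

n_kept, n_removed, n_unique = count_benchmark_sources(experiment_sources, net_sources)
print(f"Removed {n_removed} experiments without sources in net.")
print(f"Running {n_kept} experiments for {n_unique} unique sources.")
```

If the number of unique sources is counted this way, it should change whenever a different network removes a different set of experiments, which is why a constant 25 across both networks looks suspicious.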

Many thanks as always for the great software and your attentive help.

Marcos

PauBadiaM commented 6 months ago

Hi @IBMB-MFP,

Good eye! I accidentally introduced a small bug in the latest release. I have fixed it, and it should now display the correct number of unique TFs. Please install the latest version from GitHub and let me know if it works:

pip install git+https://github.com/saezlab/decoupler-py

IBMB-MFP commented 6 months ago

Using 100 network...
Extracting inputs...
Formating net...
Removed 23 experiments without sources in net.
Running 67 experiments for 33 unique sources.
Running methods...

Yes, seems to give the correct output now. Thanks!

saezlab / decoupler-py

Verbose output from benchmarking pipeline seems to say 'for 25 unique sources' despite more sources #111