niranjchandrasekaran / compare_ap

Compare AP values calculated by the different scripts

Compare the different average precision calculation scripts #2

Open niranjchandrasekaran opened 1 year ago

niranjchandrasekaran commented 1 year ago

For this exercise, I used 40 plates from CPJUMP1 (2 cell types x 2 time points x (4 compound + 4 CRISPR + 2 ORF plates)). I chose these plates partly because they make a good-sized dataset for testing the scripts, and partly because matric had previously been run on them.

Before running the notebooks in this repo, clone the data repo outside the root of this repo. Then run the create-parquet notebook to create a parquet file with profiles from all 40 plates.

I compared matric with copairs and two versions of my own average precision scripts: a non-vectorized version, which was used in the current version of the CPJUMP1 paper repo, and a new version that vectorizes the AP calculation.
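To make the vectorized vs. non-vectorized distinction concrete, here is a minimal sketch of average precision for a single ranked list, written once with a Python loop and once with NumPy array operations. This is not the code from matric, copairs, or the scripts in this repo. The function names and the toy input are hypothetical, and it assumes the standard definition of AP (mean of precision@k over the ranks of the relevant items).

```python
import numpy as np


def average_precision_loop(relevance):
    """AP for one ranked list, using an explicit Python loop.

    `relevance` is a sequence of 0/1 flags ordered from most to least
    similar (hypothetical input format, chosen for illustration).
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def average_precision_vectorized(relevance):
    """Same quantity, computed with NumPy array operations."""
    rel = np.asarray(relevance, dtype=float)
    n_relevant = rel.sum()
    if n_relevant == 0:
        return 0.0
    ranks = np.arange(1, rel.size + 1)
    precision_at_k = np.cumsum(rel) / ranks
    # Only ranks holding a relevant item contribute to the sum.
    return float((precision_at_k * rel).sum() / n_relevant)


# Toy ranked list: 1 = matching profile, 0 = non-matching profile.
ranked = [1, 0, 1, 0, 0, 1]
ap_loop = average_precision_loop(ranked)
ap_vec = average_precision_vectorized(ranked)
assert np.isclose(ap_loop, ap_vec)
```

Both functions return the same value for the same input. The vectorized form replaces the per-item loop with cumulative sums, which is the kind of change that typically accounts for speedups when the calculation is repeated over many profiles.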

Comparing the results

Notebook: The average precision values from all the scripts are the same.

Execution time

I don't know how long it takes matric to run, but the following are the execution times of the other scripts.

| Script | Execution time |
| --- | --- |
| Matric | not available |
| Copairs | ~1 min |
| vectorized | ~2.5 min |
| non-vectorized | ~21.5 min |

Features of each script

| Features | Matric | Copairs | My vectorized script |
| --- | --- | --- | --- |
| Language | R | Python | Python |
| Speed | NA | Very fast | fast |
| Multiple matching columns | Yes | Requires some preprocessing | Yes |
| Multi label | Yes | Yes | Yes |
| Average precision calculation | Yes | Yes | Yes |
| p value | Yes | Yes | Yes |
| Adjusted average precision | Yes | No | Yes |
| Available as a package? | Yes | Yes | No |
| Optimized for speed? | Yes | Yes | No |
| Ease of use for Python users in the lab | Difficult | Easy | Easy |
| Has it been extensively tested? | Yes | No | No |

niranjchandrasekaran commented 1 year ago

@johnarevalo I compared the different scripts we currently have for AP calculation. I have summarized my observations and findings. I hope this will help us decide our strategy going forward. Let me know if I have gotten any details about copairs wrong. We can then talk to Alex and Shantanu about it.

niranjchandrasekaran commented 1 year ago

After discussing with Shantanu and Alex, we decided to use copairs for all the Morphmap analyses. John will create a notebook with an example showing how to load and use copairs.