Open niranjchandrasekaran opened 1 year ago
@johnarevalo I compared the different scripts we currently have for AP calculation. I have summarized my observations and findings. I hope this will help us decide our strategy going forward. Let me know if I have gotten any details about copairs wrong. We can then talk to Alex and Shantanu about it.
After discussing with Shantanu and Alex, we decided to use copairs for all the Morphmap analyses. John will create a notebook with an example showing how to load and use copairs.
For this exercise, I used 40 plates from CPJUMP1 (2 cell types x 2 time points x (4 compound + 4 CRISPR + 2 ORF plates)). I chose these plates in part because they form a good-sized dataset for testing the scripts, and in part because matric had previously been run on them.
Before running the notebooks in this repo, clone the data repo outside the root of this repo. Then run the create-parquet notebook to create a parquet file with profiles from all 40 plates.
I compared matric with copairs and two versions of my average precision scripts: a non-vectorized version, which was used in the current version of the CPJUMP1 paper repo, and a new version that vectorizes the AP calculation.
Comparing the results
Notebook
The average precision values from all the scripts are the same.
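To illustrate why a vectorized and a non-vectorized AP calculation should agree, here is a minimal sketch. It is not the code from any of the scripts compared above (nor from matric or copairs). The function names `average_precision_loop` and `average_precision_vectorized` are hypothetical, and the input is assumed to be a list of 0/1 labels already ranked from most to least similar.

```python
import numpy as np


def average_precision_loop(labels):
    """AP for one ranked list, using an explicit Python loop.

    `labels` holds 0/1 flags (1 = positive pair), assumed to be already
    sorted from the most similar to the least similar profile.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_positive in enumerate(labels, start=1):
        if is_positive:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else float("nan")


def average_precision_vectorized(labels):
    """Same quantity as above, computed with NumPy array operations."""
    labels = np.asarray(labels, dtype=float)
    n_pos = labels.sum()
    if n_pos == 0:
        return float("nan")
    ranks = np.arange(1, labels.size + 1)
    precision_at_k = np.cumsum(labels) / ranks
    # Only ranks that hold a positive contribute to the average.
    return float((precision_at_k * labels).sum() / n_pos)


# Hypothetical data: 100 random ranked lists of 50 labels each.
rng = np.random.default_rng(0)
ranked_lists = rng.integers(0, 2, size=(100, 50))
loop_ap = np.array([average_precision_loop(row) for row in ranked_lists])
vec_ap = np.array([average_precision_vectorized(row) for row in ranked_lists])
assert np.allclose(loop_ap, vec_ap, equal_nan=True)
```

Both functions compute the mean of precision-at-rank over the positive positions, so any difference between them would point to a bug rather than a methodological choice.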
Execution time
I don't know how long it takes matric to run, but the following are the execution times of the other scripts.
Features of each script