broadinstitute / 2022_PERISCOPE

This repository contains all supporting analyses and files for Ramezani, Bauman, Singh, and Weisbart, et al. "A genome-wide atlas of human cell morphology".
BSD 3-Clause "New" or "Revised" License
2 stars 0 forks source link

Figure 3 update, extended figure 8 addition #22

Closed MerajRamezani closed 1 year ago

MerajRamezani commented 1 year ago

@ErinWeisbart can you check out these update? We need to update the folder/notebook name(numbers) after this.

ErinWeisbart commented 1 year ago

@MerajRamezani A couple concerns:

1) There is now a deprecation warning that wasn't there previously in Figure 3 notebook.

The environment /opt/anaconda3/envs/periscope_2022_updated/lib/python3.9/site-packages/umap/distances.py:1063: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
  @numba.jit()

What is different about your periscope_2022_updated environment that triggered this warning when compared to the environment.yml that is in the repo? This PR doesn't include any changes to the environment?

2) In Figure 3 you've added in a PCA before H/I. This is duplicated in Supplemental 8 so the file needs to be saved out and read into Supplemental 8 instead.

3) In Figure 3, markdown header Figures 3G and 3H needs to be changed to Figures 3H and 3I

MerajRamezani commented 1 year ago

@ErinWeisbart Some ideas: 1- I haven't added anything new to the .yml file but it seems that the problem is caused by the numba version which we haven't fixed in the yml file and was updated on my system: from 0.56.3 to 0.57.0 . I can upload a version without error or we can update the yml file (or both!).

2- The reason for duplication is these are similar heatmap types the we pushed ti the supplement because we did not have enough space. Since the PCA is an important part of the analysis don't you think we should keep it this way in case someone might run them separately? For example if someone wants to run the supplemental 8 without figure 3 first.

3- I can update that

ErinWeisbart commented 1 year ago
  1. Let's fix the number version in the yml to what it was when we ran through all the original analyses
  2. No, we should only have the code once. If we're generating the PCA here then we should save the PCA outputs and load those outputs into Supp 8. They will be able to run Supp 8 without figure 3 first.
MerajRamezani commented 1 year ago

@ErinWeisbart I tried to address all the concerns. Let me know what you think.

MerajRamezani commented 1 year ago

@ErinWeisbart Could you merge the pull request if everything seems to be in order?

ErinWeisbart commented 1 year ago

Convention throughout the rest of the repo is that any outputs from a notebook are saved into its "outputs" folder so I moved the PCA files back to Fig3/outputs and fixed the import path in Supp8. Otherwise its fine to merge.