AlexsLemonade / OpenScPCA-analysis

An open, collaborative project to analyze data from the Single-cell Pediatric Cancer Atlas (ScPCA) Portal

02 run azimuth before kweight #737

Closed maud-p closed 2 months ago

maud-p commented 2 months ago

Purpose/implementation Section

Please link to the GitHub issue that this pull request addresses.

This copies PR #706 (https://github.com/AlexsLemonade/OpenScPCA-analysis/pull/706) in order to:

• roll back to what we had in https://github.com/AlexsLemonade/OpenScPCA-analysis/commit/f98da21d6f936791da10ca2d903e9661d9f409d0 (i.e., before all the k.weight debugging, etc.)
• use system() instead of source() for download-and-create-fetal-kidney-ref.R in 00_run_workflow.R
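
For readers unfamiliar with the distinction: source() evaluates a script inside the current R session, whereas system() launches a separate process. The thread does not say why the switch was made, so the sketch below only illustrates the mechanics. The helper name run_script_isolated is hypothetical and not part of the module, and the sketch assumes the Rscript binary that ships with the running R installation.

```r
# Hypothetical helper (not from the module): run an R script in a fresh
# Rscript process via system(), instead of source()-ing it into this session.
run_script_isolated <- function(script_path) {
  # Use the Rscript binary that belongs to the currently running R
  rscript <- file.path(R.home("bin"), "Rscript")
  command <- paste(shQuote(rscript), shQuote(script_path))

  # With the default intern = FALSE, system() returns the exit status
  status <- system(command)
  if (status != 0) {
    stop("Script exited with status ", status, ": ", script_path)
  }
  invisible(status)
}
```

If the workflow script called something like run_script_isolated("download-and-create-fetal-kidney-ref.R"), objects and packages created by the child script would stay out of the caller's workspace, and a non-zero exit status would stop the workflow.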

#703

What is the goal of this pull request?

I wrote an RMarkdown notebook to:

  1. download the data from the fetal kidney atlas,
  2. process it and create an Azimuth-compatible reference.

This will be used in the module to perform label transfer from the human fetal kidney atlas to the Wilms tumor samples.

Briefly describe the general approach you took to achieve this goal.

I used a different approach from the one described in issue #703. I found that the human fetal kidney data can be downloaded from cellxgene as an RDS object: url = "https://datasets.cellxgene.cziscience.com/40ebb8e4-1a25-4a33-b8ff-02d1156e4e9b.rds"
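
A minimal sketch of that download step, assuming only base R. The helper name fetch_rds and any destination path are hypothetical; only the URL comes from this pull request. The notebook in the PR may do this differently.

```r
# URL quoted in the pull request description
fetal_kidney_url <- "https://datasets.cellxgene.cziscience.com/40ebb8e4-1a25-4a33-b8ff-02d1156e4e9b.rds"

# Hypothetical helper: download `url` to `dest` unless the file already exists,
# then read the object back with readRDS().
fetch_rds <- function(url, dest) {
  if (!file.exists(dest)) {
    dir.create(dirname(dest), recursive = TRUE, showWarnings = FALSE)
    # mode = "wb" keeps the binary RDS file intact on every platform
    status <- download.file(url, destfile = dest, mode = "wb")
    if (status != 0) {
      stop("Download failed for ", url)
    }
  }
  readRDS(dest)
}
```

A call such as fetch_rds(fetal_kidney_url, "scratch/fetal_kidney.rds") (the path is only an example) would skip the download on later runs. Note that download.file() honours getOption("timeout"), which defaults to 60 seconds, so a large file may need a higher value set with options(timeout = ...) beforehand.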

This way, I didn't need to create a conda/renv environment. The Dockerfile in this PR is the same as the Dockerfile in PR #704.

Of note, however: using your documentation, I was able to build a conda/renv container that can run the scripts I have written so far, in case we need it in the future :)

If known, do you anticipate filing additional pull requests to complete this analysis module?

Yes! The next one will be to implement the label transfer in the sample report.

Results

What is the name of your results bucket on S3?

Here is the command I used to upload the results to the bucket:

scripts/sync-results.py cell-type-wilms-tumor-06 \
    --bucket researcher-008971640512-us-east-2 \

What types of results does your code produce (e.g., table, figure)?

The Azimuth-compatible reference, in the form of two files:

When 00_fetal_reference_kidney.Rmd is run, these two files are saved in the marker-genes folder of the module.

What is your summary of the results?

We built a reference that is compatible with Azimuth label transfer.

Provide directions for reviewers

I have one issue: with the RMarkdown file 00_fetal_reference_kidney.Rmd, we can only build the reference by running each chunk manually; it does not work when we knit the file to an HTML report. I was able to isolate the problem to the AzimuthReference function. It might be related to this issue: https://github.com/satijalab/azimuth/issues/219

What are the software and computational requirements needed to be able to run the code in this PR?

  1. Run the Docker container.
  2. Open RStudio at http://localhost:8787/
  3. In the module folder, open 00_fetal_reference_kidney.Rmd and run it chunk by chunk (do not knit!)

Are there particular areas you'd like reviewers to have a close look at?

Is there anything that you want to discuss further?

Author checklists

Check all those that apply. Note that you may find it easier to check off these items after the pull request is actually filed.

Analysis module and review

Reproducibility checklist

maud-p commented 2 months ago

Thank you very much @jaclyn-taroni! I'll make the last changes to the README file right now and run the analysis over the evening/night. If everything is fine, I should commit the 40 reports by tomorrow morning :) (European time, Vienna).

Thank you so much, looking forward to the next steps :D

maud-p commented 2 months ago

@jaclyn-taroni FYI, the extraneous RDS files for samples 168-178 in my results bucket should have been removed using --destructive-sync !

jaclyn-taroni commented 2 months ago

https://github.com/AlexsLemonade/OpenScPCA-analysis/pull/737/commits/5afefe3e96f8e06a49c839f2d9729af93aa1c7e2 is just the HTML output of the notebooks after the change in https://github.com/AlexsLemonade/OpenScPCA-analysis/pull/737/commits/7e877edce5323c5c7247a9088712c13ac577868f, which passed CI: https://github.com/AlexsLemonade/OpenScPCA-analysis/actions/runs/10656117562/job/29534402700

Merging without the check on running the module.

maud-p commented 2 months ago

Thank you so much @jaclyn-taroni ! It is great to see the merge in the main branch :) I'll create a new branch from here and continue with PR #704 ! Thanks!!