AlexsLemonade / scpca-docs

User information about ScPCA processing
https://scpca.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Create illustrations for merged object downloads #218

Closed jaclyn-taroni closed 8 months ago

jaclyn-taroni commented 10 months ago

We need illustrations for SCE, AnnData, and CITE-seq. Here's the "pseudotree" from our meeting notes:

merged_object.{extension}
merged_summary_report.html
individual_reports/
└── individual_library_id/
    ├── library_id_qc.html
    └── library_id_cell_type.html

Before we start work, this issue should be updated to be more descriptive.

allyhawkins commented 9 months ago

@dvenprasad, as we discussed in Slack, here are examples of the download trees.

SingleCellExperiment/R (single-cell, multiplexed, CITE)

For SingleCellExperiment objects, the download will be the same for all projects, whether they have single-cell, multiplexed, or CITE-seq libraries.

SCPCP000000_merged.rds
SCPCP000000_summary_report.html
individual_reports/
└── SCPCL000000/
    ├── SCPCL000000_qc.html
    └── SCPCL000000_cell_type_report.html

AnnData/Python (single-cell, multiplexed)

For AnnData objects, the download will be the same for all projects with single-cell and multiplexed libraries. The only time this should be different is if a project has CITE-seq.

SCPCP000000_merged.hdf5
SCPCP000000_summary_report.html
individual_reports/
└── SCPCL000000/
    ├── SCPCL000000_qc.html
    └── SCPCL000000_cell_type_report.html

AnnData/Python (CITE)

For AnnData downloads of projects that have CITE-seq only.

SCPCP000000_merged_rna.hdf5
SCPCP000000_merged_adt.hdf5
SCPCP000000_summary_report.html
individual_reports/
└── SCPCL000000/
    ├── SCPCL000000_qc.html
    └── SCPCL000000_cell_type_report.html

Let me know if you have any questions or if I missed anything!

dvenprasad commented 9 months ago

@allyhawkins I want to clarify whether the QC reports for merged objects are named SCPCL000000_qc.html.

The QC reports in the current download images are labeled SCPCL000000_qc_report.html. Do they need to change as well?

allyhawkins commented 9 months ago

Good catch, that's correct! The file names should be SCPCL000000_qc.html.

dvenprasad commented 9 months ago

SingleCellExperiment/R (single-cell, multiplexed, CITE): [image: merged-sc-project-download-folder]

AnnData/Python (single-cell, multiplexed): [image: merged-anndata-project-download-folder]

AnnData/Python (CITE): [image: merged-anndata-cite-seq-project-download-folder]

I will put the corrected (qc and celltype) download folder images in another ticket. Let me know if these are good.

sjspielman commented 9 months ago

These look good to me, @dvenprasad! The only change I might actually make is renaming _summary_report.html -> _merged_summary_report.html, so it's super clear the report is only about the merged object, and not a summary of every library in the project.

@allyhawkins can you have a look too and weigh in if you agree with me on 👆 or not?

allyhawkins commented 9 months ago

I'm fine with that change. We just need to remember to change that in the workflow. Also @dvenprasad see my comment (https://github.com/AlexsLemonade/scpca-docs/issues/226#issuecomment-1885843813) about changing from SCPCL000000_celltype_report.html to SCPCL000000_celltype-report.html.

dvenprasad commented 9 months ago

Okay, I added merged_summary_report.html and corrected the cell type report to celltype-report.

SingleCellExperiment/R (single-cell, multiplexed, CITE): [image: merged-sc-project-download-folder]

AnnData/Python (single-cell, multiplexed): [image: merged-anndata-project-download-folder]

AnnData/Python (CITE): [image: merged-anndata-cite-seq-project-download-folder]

sjspielman commented 9 months ago

Wait! @dvenprasad @allyhawkins, did we mean SCPCP000000_merged-summary-report.html? (edit: vs. SCPCP000000_merged_summary_report.html)

allyhawkins commented 9 months ago

These - and _ will be the death of me. I think if we do it for one report we need to do it for the other. Sorry @dvenprasad!

dvenprasad commented 9 months ago

Alternate suggestion: if we are struggling to keep track of the _ and the -, why don't we make it easy on ourselves and just pick one? The meaning does not change and it's only a formatting choice.

It's likely that we'll have a similar conversation in the future. We can be kind to both our present and future selves (and others who might need to do this).

allyhawkins commented 9 months ago

@dvenprasad I totally agree that it would be easier to just have _ like we do for all the other ones. But @jashapiro had requested that we use - for within chunk spacing. See https://github.com/AlexsLemonade/scpca-nf/pull/643#discussion_r1442208454. That's what prompted this change in the first place. But if it's easier I'm good with sticking with all _.

jashapiro commented 9 months ago

Jenny Bryan did a number on me.

sjspielman commented 9 months ago

Yeah, the idea is that it's a naming convention: use _ to separate chunks of meaning, and - for spaces within each chunk. We haven't been very consistent with this, but I don't think it's a bad idea to be consistent with it going forward.
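
To make the convention concrete, here is a minimal sketch in Python. The helper name `build_filename` is hypothetical and is not taken from scpca-nf; it only illustrates joining chunks of meaning with `_` and replacing spaces inside a chunk with `-`.

```python
def build_filename(chunks, extension):
    """Build a file name from chunks of meaning.

    Hypothetical helper for illustration only (not part of scpca-nf).
    Chunks are joined with '_'; whitespace inside a chunk becomes '-'.
    """
    # Replace any run of whitespace inside a chunk with a single hyphen
    cleaned = ["-".join(chunk.split()) for chunk in chunks]
    # Join the chunks of meaning with underscores and add the extension
    return "_".join(cleaned) + "." + extension


print(build_filename(["SCPCL000000", "celltype report"], "html"))
# SCPCL000000_celltype-report.html
print(build_filename(["SCPCP000000", "merged summary report"], "html"))
# SCPCP000000_merged-summary-report.html
```

Under this reading, the library or project ID is one chunk and the report name is another, which is why the report name keeps its hyphens while the ID is attached with an underscore.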

dvenprasad commented 9 months ago

finalv3_no_really_finalv4_actually_final.psd

SingleCellExperiment/R (single-cell, multiplexed, CITE): [image: merged-sc-project-download-folder]

AnnData/Python (single-cell, multiplexed): see the comment below for the correct image

AnnData/Python (CITE): [image: merged-anndata-cite-seq-project-download-folder]

allyhawkins commented 9 months ago

🎉

sjspielman commented 9 months ago

One more round I think 😬...

I think for AnnData/Python (single-cell, multiplexed), we still need the _rna suffix on the merged object, so SCPCP000000_merged.hdf5 -> SCPCP000000_merged_rna.hdf5.

https://github.com/AlexsLemonade/scpca-nf/blob/f08ce452f95f88e169217c114aef1c1321a3688c/merge.nf#L97-L105
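
As a rough summary of where the thread stands at this point, the expected download contents could be sketched as below. This is an illustration, not the scpca-nf implementation: the function name `expected_merged_files` and its arguments are hypothetical, and the report names assume the hyphenated forms discussed above (`merged-summary-report`, `celltype-report`) are the ones adopted.

```python
def expected_merged_files(project_id, library_ids, file_format, has_cite=False):
    """List the files expected in a merged-object download.

    Hypothetical sketch based on the names discussed in this thread.
    file_format is "sce" (SingleCellExperiment) or "anndata".
    """
    if file_format == "sce":
        # One .rds file regardless of modality (single-cell, multiplexed, CITE)
        objects = [f"{project_id}_merged.rds"]
    elif file_format == "anndata":
        # The RNA object always carries the _rna suffix
        objects = [f"{project_id}_merged_rna.hdf5"]
        if has_cite:
            # CITE-seq projects get a separate ADT object
            objects.append(f"{project_id}_merged_adt.hdf5")
    else:
        raise ValueError(f"Unknown file format: {file_format}")

    # Assumed report names; see the '_' vs '-' discussion above
    reports = [f"{project_id}_merged-summary-report.html"]
    for library_id in library_ids:
        folder = f"individual_reports/{library_id}"
        reports.append(f"{folder}/{library_id}_qc.html")
        reports.append(f"{folder}/{library_id}_celltype-report.html")

    return objects + reports


for path in expected_merged_files(
    "SCPCP000000", ["SCPCL000000"], "anndata", has_cite=True
):
    print(path)
```

The only branch that depends on modality is the AnnData one, which matches the three download illustrations requested in this issue.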

dvenprasad commented 9 months ago

AnnData/Python (single-cell, multiplexed): [image: merged-anndata-project-download-folder]

sjspielman commented 9 months ago

🐻‍❄️ i hear we like animal emojis around here

dvenprasad commented 9 months ago

All right, here is the merged object download with single_cell_metadata.tsv and bulk files.

SingleCellExperiment/R (single-cell, multiplexed, CITE): [image: merged-sc-project-download-folder]

AnnData/Python (single-cell, multiplexed): [image: merged-anndata-project-download-folder]

AnnData/Python (CITE): [image: merged-anndata-cite-seq-project-download-folder]

sjspielman commented 9 months ago

@dvenprasad I think we've made it!!

In the interest of making sure, @allyhawkins can you look too?

allyhawkins commented 9 months ago

These look good to me! Thank you for dealing with all of our changes, @dvenprasad.

sjspielman commented 8 months ago

final.final closed by #256