AlexsLemonade / scpca-docs

User information about ScPCA processing
https://scpca.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Create illustrations for merged object downloads #218

Closed jaclyn-taroni closed 9 months ago

jaclyn-taroni commented 11 months ago

We need illustrations for SCE, AnnData, and CITE-seq. Here's the "pseudotree" from our meeting notes:

merged_object.{extension}
merged_summary_report.html
individual_reports/
└── individual_library_id/
        ├──library_id_qc.html
        └──library_id_cell_type.html

Before we start work, this issue should be updated to be more descriptive.

allyhawkins commented 10 months ago

@dvenprasad, as we discussed in Slack, here are examples of the download trees.

SingleCellExperiment/R (single-cell, multiplexed, CITE)

For all projects, whether single-cell, multiplexed, or CITE-seq, the download will have the same structure for SingleCellExperiment objects.

SCPCP000000_merged.rds
SCPCP000000_summary_report.html
individual_reports/
└── SCPCL000000/
        ├──SCPCL000000_qc.html
        └──SCPCL000000_cell_type_report.html

AnnData/Python (single-cell, multiplexed)

For all single-cell and multiplexed projects, the download will have the same structure for AnnData objects. It should only differ if a project has CITE-seq.

SCPCP000000_merged.hdf5
SCPCP000000_summary_report.html
individual_reports/
└── SCPCL000000/
        ├──SCPCL000000_qc.html
        └──SCPCL000000_cell_type_report.html

AnnData/Python (CITE)

For AnnData downloads of projects that have CITE-seq.

SCPCP000000_merged_rna.hdf5
SCPCP000000_merged_adt.hdf5
SCPCP000000_summary_report.html
individual_reports/
└── SCPCL000000/
        ├──SCPCL000000_qc.html
        └──SCPCL000000_cell_type_report.html

Let me know if you have any questions or if I missed anything!
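
For anyone scripting against these downloads, the three layouts above can be sketched as a small Python helper. This is only an illustration under stated assumptions: the function name `expected_download_files` and its `file_format` and `has_cite` arguments are hypothetical and are not part of scpca-nf or the portal. The file names are the ones listed in this comment, and several of them are revised later in the thread.

```python
from pathlib import PurePosixPath


def expected_download_files(project_id, library_ids, file_format, has_cite=False):
    """Return the relative paths expected in a merged-object download.

    Hypothetical helper: file names follow the trees in this comment,
    not necessarily the final names used by the workflow.
    """
    if file_format == "sce":
        # SingleCellExperiment: one RDS file, with or without CITE-seq
        merged = [f"{project_id}_merged.rds"]
    elif file_format == "anndata":
        if has_cite:
            # AnnData with CITE-seq: RNA and ADT are separate files
            merged = [
                f"{project_id}_merged_rna.hdf5",
                f"{project_id}_merged_adt.hdf5",
            ]
        else:
            merged = [f"{project_id}_merged.hdf5"]
    else:
        raise ValueError(f"unknown file_format: {file_format!r}")

    files = merged + [f"{project_id}_summary_report.html"]

    # One folder of reports per library in the project
    for library_id in library_ids:
        report_dir = PurePosixPath("individual_reports") / library_id
        files.append(str(report_dir / f"{library_id}_qc.html"))
        files.append(str(report_dir / f"{library_id}_cell_type_report.html"))
    return files
```

For example, `expected_download_files("SCPCP000000", ["SCPCL000000"], "anndata", has_cite=True)` lists the two merged `.hdf5` files, the summary report, and the two per-library reports.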

dvenprasad commented 10 months ago

@allyhawkins I want to clarify whether the QC reports for merged objects are named SCPCL000000_qc.html.

The QC reports in the current download images are labeled SCPCL000000_qc_report.html. Do they need to change as well?

allyhawkins commented 10 months ago

Good catch, that's correct! The file names should be SCPCL000000_qc.html.

dvenprasad commented 10 months ago

SingleCellExperiment/R (single-cell, multiplexed, CITE): [image: merged-sc-project-download-folder]

AnnData/Python (single-cell, multiplexed): [image: merged-anndata-project-download-folder]

AnnData/Python (CITE): [image: merged-anndata-cite-seq-project-download-folder]

I will put the corrected (qc and celltype) download folder images in another ticket. Let me know if these are good.

sjspielman commented 10 months ago

These look good to me, @dvenprasad! The only change I might actually make is renaming _summary_report.html -> _merged_summary_report.html, so it's super clear the report is only about the merged object, and not a summary of every library in the project.

@allyhawkins can you have a look too and weigh in if you agree with me on 👆 or not?

allyhawkins commented 10 months ago

I'm fine with that change. We just need to remember to change that in the workflow. Also @dvenprasad see my comment (https://github.com/AlexsLemonade/scpca-docs/issues/226#issuecomment-1885843813) about changing from SCPCL000000_celltype_report.html to SCPCL000000_celltype-report.html.

dvenprasad commented 10 months ago

Okay, added merged_summary_report.html and corrected the cell type report to celltype-report.

SingleCellExperiment/R (single-cell, multiplexed, CITE): [image: merged-sc-project-download-folder]

AnnData/Python (single-cell, multiplexed): [image: merged-anndata-project-download-folder]

AnnData/Python (CITE): [image: merged-anndata-cite-seq-project-download-folder]

sjspielman commented 10 months ago

wait! @dvenprasad @allyhawkins , did we mean SCPCP000000_merged-summary-report.html?? (edit - vs SCPCP000000_merged_summary_report.html)

allyhawkins commented 10 months ago

These - and _ will be the death of me. I think if we do it for one report we need to do it for the other. Sorry @dvenprasad!

dvenprasad commented 10 months ago

Alternate suggestion: if we are struggling to keep track of the _ and the -, why don't we make it easy on ourselves and just pick one? The meaning does not change and it's only a formatting choice.

It's likely that we'll have a similar conversation in the future. We can be kind to both our present and future selves (and to others who might need to do this).

allyhawkins commented 10 months ago

@dvenprasad I totally agree that it would be easier to just have _ like we do for all the other ones. But @jashapiro had requested that we use - for within-chunk spacing. See https://github.com/AlexsLemonade/scpca-nf/pull/643#discussion_r1442208454. That's what prompted this change in the first place. But if it's easier, I'm good with sticking with all _.

jashapiro commented 10 months ago

Jenny Bryan did a number on me.

sjspielman commented 10 months ago

Yeah, the idea is that it's a naming convention: use _ to separate chunks of meaning, and - for spaces within each chunk. We haven't been very consistent with this, but I don't think it's a bad idea to be consistent with it going forward.
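
To make the convention concrete, here is a minimal Python sketch of it. The helper names `build_filename` and `parse_filename` are hypothetical, made up for illustration, and are not from scpca-nf. The sketch assumes individual words never contain `_`, `-`, or `.` themselves.

```python
def build_filename(chunks, extension):
    """Join words within a chunk with '-' and join chunks with '_'.

    `chunks` is a list of lists of words, e.g.
    [["SCPCP000000"], ["merged", "summary", "report"]].
    """
    for chunk in chunks:
        if not chunk:
            raise ValueError("each chunk needs at least one word")
        for word in chunk:
            # Separators inside a word would make the name ambiguous
            if not word or any(sep in word for sep in "_-."):
                raise ValueError(f"invalid word: {word!r}")
    stem = "_".join("-".join(chunk) for chunk in chunks)
    return f"{stem}.{extension}"


def parse_filename(filename):
    """Split a file name back into its chunks of words and its extension."""
    stem, _, extension = filename.rpartition(".")
    chunks = [chunk.split("-") for chunk in stem.split("_")]
    return chunks, extension
```

With this, `build_filename([["SCPCP000000"], ["merged", "summary", "report"]], "html")` gives `SCPCP000000_merged-summary-report.html`, and parsing `SCPCL000000_celltype-report.html` recovers the ID chunk and the report chunk separately.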

dvenprasad commented 10 months ago

finalv3_no_really_finalv4_actually_final.psd

SingleCellExperiment/R (single-cell, multiplexed, CITE): [image: merged-sc-project-download-folder]

AnnData/Python (single-cell, multiplexed): see the comment below for the correct image

AnnData/Python (CITE): [image: merged-anndata-cite-seq-project-download-folder]

allyhawkins commented 10 months ago

🎉

sjspielman commented 10 months ago

One more round I think 😬...

I think for AnnData/Python (single-cell, multiplexed), we still need the _rna suffix on the merged file. So, SCPCP000000_merged.hdf5 -> SCPCP000000_merged_rna.hdf5

https://github.com/AlexsLemonade/scpca-nf/blob/f08ce452f95f88e169217c114aef1c1321a3688c/merge.nf#L97-L105

dvenprasad commented 10 months ago

AnnData/Python (single-cell, multiplexed): [image: merged-anndata-project-download-folder]

sjspielman commented 10 months ago

🐻‍❄️ i hear we like animal emojis around here

dvenprasad commented 10 months ago

All right, here is the merged object download with single_cell_metadata.tsv and bulk files.

SingleCellExperiment/R (single-cell, multiplexed, CITE): [image: merged-sc-project-download-folder]

AnnData/Python (single-cell, multiplexed): [image: merged-anndata-project-download-folder]

AnnData/Python (CITE): [image: merged-anndata-cite-seq-project-download-folder]

sjspielman commented 10 months ago

@dvenprasad I think we've made it!!

In the interest of making sure, @allyhawkins can you look too?

allyhawkins commented 10 months ago

These look good to me! Thank you for dealing with all of our changes, @dvenprasad!

sjspielman commented 9 months ago

final.final closed by #256