hubmapconsortium / ingest-api

MIT License

Analysis: Performance tuning of Ingest Board #512

Closed shirey closed 6 months ago

shirey commented 7 months ago

The ingest board can be very slow to open, especially over slower network connections:

Is too much data being pushed over slow connections? Is the caching layer not working correctly?

Currently, PROD HuBMAP pushes ~70 MB to the UI on load. What extra data is in there that isn't needed? @maxsibilla may have some ideas/info here.

Should we page the incoming data (lots of work), or serve it via Elasticsearch/OpenSearch?

Should we "load on demand" without paging?

maxsibilla commented 6 months ago

I looked at the response generated by the /data-status endpoint, and I think the inflated size of the JSON is caused by the number of properties inside processed_datasets. Here is the JSON returned for a single dataset:

{
    "activity_creation_action": "Create Dataset Activity",
    "assigned_to_group_name": "",
    "created_timestamp": 1627494634414,
    "data_access_level": "protected",
    "dataset_type": "RNAseq",
    "donor_hubmap_id": "HBM289.BLHF.363",
    "donor_lab_id": "KTRC_PPID_3499",
    "donor_submission_id": "UCSD0006",
    "globus_url": "https://app.globus.org/file-manager?origin_id=24c2ee95-146d-4513-a1b3-ac0bfdb7856f&origin_path=%2Fprotected%2FUniversity%20of%20California%20San%20Diego%20TMC%2F421007293469db7b528ce6478c00348d%2F",
    "group_name": "University of California San Diego TMC",
    "has_contacts": "True",
    "has_contributors": "True",
    "has_data": "True",
    "has_dataset_metadata": "True",
    "has_donor_metadata": "True",
    "has_rui_info": "True",
    "hubmap_id": "HBM575.XFCT.276",
    "ingest_task": "",
    "is_primary": "True",
    "last_touch": 1643320101710,
    "organ": "Kidney (Right)",
    "organ_hubmap_id": "HBM482.CMSP.343",
    "organ_uuid": "b34c049c809533ca3ab0220670cb1632",
    "processed_datasets": [
        {
            "contains_human_genetic_sequences": false,
            "created_by_user_displayname": "HuBMAP Process",
            "created_by_user_email": "hubmap@hubmapconsortium.org",
            "created_by_user_sub": "3e7bce63-129d-33d0-8f6c-834b34cd382e",
            "created_timestamp": 1627497329568,
            "data_access_level": "public",
            "data_types": "['salmon_rnaseq_snareseq']",
            "dataset_info": "snRNA-seq (SNARE-seq) [Salmon] data from the kidney (right) of a 57-year-old white female",
            "dataset_type": "RNAseq [Salmon]",
            "entity_type": "Dataset",
            "group_name": "University of California San Diego TMC",
            "group_uuid": "03b3d854-ed44-11e8-8bce-0e368f3075e8",
            "hubmap_id": "HBM947.GLGL.465",
            "last_modified_timestamp": 1698974013104,
            "last_modified_user_displayname": "Karl Burke",
            "last_modified_user_email": "KBURKE@pitt.edu",
            "last_modified_user_sub": "3aaa925d-755a-4193-98c7-44455de783ff",
            "pipeline_message": "the process ran",
            "published_timestamp": 1643320156010,
            "published_user_displayname": "HuBMAP Process",
            "published_user_email": "hubmap@hubmapconsortium.org",
            "published_user_sub": "3e7bce63-129d-33d0-8f6c-834b34cd382e",
            "status": "Published",
            "uuid": "dd3e6da115fd13eee62d9a4fba2324ae"
        },
        {
            "contains_human_genetic_sequences": false,
            "created_by_user_displayname": "HuBMAP Process",
            "created_by_user_email": "hubmap@hubmapconsortium.org",
            "created_by_user_sub": "3e7bce63-129d-33d0-8f6c-834b34cd382e",
            "created_timestamp": 1686252397263,
            "data_access_level": "public",
            "data_types": "['salmon_rnaseq_snareseq']",
            "dataset_info": "salmon_rnaseq_snareseq__efe5894cd09b0a411ecfe46749e6db73_397b6f876ec7976923650135481d08e5_421007293469db7b528ce6478c00348d_76584f2fe8e8ec215d78b83296461bbf_83886790424cd10f6b50159ef50555ed_f852d653926edb68292e39b95a3469cf__salmon-rnaseq-snareseq",
            "dataset_type": "RNAseq [Salmon]",
            "entity_type": "Dataset",
            "group_name": "University of California San Diego TMC",
            "group_uuid": "03b3d854-ed44-11e8-8bce-0e368f3075e8",
            "hubmap_id": "HBM334.DWWF.436",
            "ingest_metadata": "{'dag_provenance_list': [{'hash': '8b0d9f5', 'origin': 'https://github.com/hubmapconsortium/ingest-pipeline.git'}, {'hash': '8b0d9f5', 'origin': 'https://github.com/hubmapconsortium/ingest-pipeline.git'}, {'hash': 'd18fd49', 'name': 'pipeline.cwl', 'origin': 'https://github.com/hubmapconsortium/salmon-rnaseq'}, {'hash': '94520dc', 'name': 'pipeline.cwl', 'origin': 'https://github.com/hubmapconsortium/azimuth-annotate'}, {'hash': '0b94a44', 'name': 'anndata-to-ui.cwl', 'origin': 'https://github.com/hubmapconsortium/portal-containers'}], 'files': [{'description': 'Disperson plot of gene expression', 'edam_term': 'EDAM_1.24.format_3508', 'is_qa_qc': False, 'rel_path': 'dispersion_plot.pdf', 'size': 1161229, 'type': 'pdf'}, {'description': 'Genome build information in JSON format', 'edam_term': 'EDAM_1.24.format_3464', 'is_qa_qc': False, 'rel_path': 'genome_build.json', 'size': 106, 'type': 'json'}, {'description': \"Normalized gene expression with additional metadata, in HDF5 format, readable with the 'anndata' Python package\", 'edam_term': 'EDAM_1.24.format_3590', 'is_qa_qc': False, 'rel_path': 'secondary_analysis.h5ad', 'size': 2754480510, 'type': 'h5ad', 'is_data_product': True}, {'description': 'Quality control report in JSON format', 'edam_term': 'EDAM_1.24.format_3464', 'is_qa_qc': True, 'rel_path': 'qc_results.json', 'size': 451, 'type': 'json'}, {'description': 'scVelo RNA velocity grid plot', 'edam_term': 'EDAM_1.24.format_3508', 'is_qa_qc': False, 'rel_path': 'scvelo_embedding_grid.pdf', 'size': 313729, 'type': 'pdf'}, {'description': \"RNA velocity results, in HDF5 format, readable with the 'anndata' Python package\", 'edam_term': 'EDAM_1.24.format_3590', 'is_qa_qc': False, 'rel_path': 'scvelo_annotated.h5ad', 'size': 463122785, 'type': 'h5ad', 'is_data_product': True}, {'description': 'UMAP plot of cells, colored by Leiden cluster ID', 'edam_term': 'EDAM_1.24.format_3508', 'is_qa_qc': False, 'rel_path': 
'umap_by_leiden_cluster.pdf', 'size': 285686, 'type': 'pdf'}, {'description': \"Raw gene expression, in HDF5 format, readable with the 'anndata' Python package, intronic counts as AnnData layer\", 'edam_term': 'EDAM_1.24.format_3590', 'is_qa_qc': False, 'rel_path': 'expr.h5ad', 'size': 279275224, 'type': 'h5ad', 'is_data_product': True}, {'description': 'UMAP plot of cells, colored by embedding density', 'edam_term': 'EDAM_1.24.format_3508', 'is_qa_qc': False, 'rel_path': 'umap_embedding_density.pdf', 'size': 2634935, 'type': 'pdf'}, {'description': 'Quality control results from Scanpy, per-cell and per-gene, in HDF5 format', 'edam_term': 'EDAM_1.24.format_3590', 'is_qa_qc': True, 'rel_path': 'qc_results.hdf5', 'size': 10666768, 'type': 'hdf5'}, {'description': \"Raw gene expression, in HDF5 format, readable with the 'anndata' Python package, intronic counts as AnnData layer\", 'edam_term': 'EDAM_1.24.format_3590', 'is_qa_qc': False, 'rel_path': 'raw_expr.h5ad', 'size': 136384192, 'type': 'h5ad'}, {'description': 'FastQC report of input FASTQ file', 'edam_term': 'EDAM_1.24.format_2331', 'is_qa_qc': True, 'rel_path': 'fastqc_output/BUKMAP_20190822A.P6_N730_S12_R1_fastqc.html', 'size': 722453, 'type': 'unknown'}, {'description': 'FastQC report of input FASTQ file', 'edam_term': 'EDAM_1.24.format_2331', 'is_qa_qc': True, 'rel_path': 'fastqc_output/BUKMAP_20190822A.P3_N727_S9_R2_fastqc.html', 'size': 812495, 'type': 'unknown'}, {'description': 'FastQC report of input FASTQ file', 'edam_term': 'EDAM_1.24.format_2331', 'is_qa_qc': True, 'rel_path': 'fastqc_output/BUKMAP_20190822A.P2_N726_S8_R1_fastqc.html', 'size': 721400, 'type': 'unknown'}, {'description': 'FastQC report of input FASTQ file', 'edam_term': 'EDAM_1.24.format_2331', 'is_qa_qc': True, 'rel_path': 'fastqc_output/BUKMAP_20190822A.P1_N725_S7_R1_fastqc.html', 'size': 718044, 'type': 'unknown'}, {'description': 'FastQC report of input FASTQ file', 'edam_term': 'EDAM_1.24.format_2331', 'is_qa_qc': True, 
'rel_path': 'fastqc_output/BUKMAP_20190822A.P3_N727_S9_R1_fastqc.html', 'size': 721822, 'type': 'unknown'}, {'description': 'FastQC report of input FASTQ file', 'edam_term': 'EDAM_1.24.format_2331', 'is_qa_qc': True, 'rel_path': 'fastqc_output/BUKMAP_20190822A.P5_N729_S11_R2_fastqc.html', 'size': 801030, 'type': 'unknown'}, {'description': 'FastQC report of input FASTQ file', 'edam_term': 'EDAM_1.24.format_2331', 'is_qa_qc': True, 'rel_path': 'fastqc_output/BUKMAP_20190822A.P4_N728_S10_R2_fastqc.html', 'size': 801385, 'type': 'unknown'}, {'description': 'FastQC report of input FASTQ file', 'edam_term': 'EDAM_1.24.format_2331', 'is_qa_qc': True, 'rel_path': 'fastqc_output/BUKMAP_20190822A.P2_N726_S8_R2_fastqc.html', 'size': 799830, 'type': 'unknown'}, {'description': 'FastQC report of input FASTQ file', 'edam_term': 'EDAM_1.24.format_2331', 'is_qa_qc': True, 'rel_path': 'fastqc_output/BUKMAP_20190822A.P6_N730_S12_R2_fastqc.html', 'size': 809822, 'type': 'unknown'}, {'description': 'FastQC report of input FASTQ file', 'edam_term': 'EDAM_1.24.format_2331', 'is_qa_qc': True, 'rel_path': 'fastqc_output/BUKMAP_20190822A.P1_N725_S7_R2_fastqc.html', 'size': 804534, 'type': 'unknown'}, {'description': 'FastQC report of input FASTQ file', 'edam_term': 'EDAM_1.24.format_2331', 'is_qa_qc': True, 'rel_path': 'fastqc_output/BUKMAP_20190822A.P4_N728_S10_R1_fastqc.html', 'size': 720307, 'type': 'unknown'}, {'description': 'FastQC report of input FASTQ file', 'edam_term': 'EDAM_1.24.format_2331', 'is_qa_qc': True, 'rel_path': 'fastqc_output/BUKMAP_20190822A.P5_N729_S11_R1_fastqc.html', 'size': 715565, 'type': 'unknown'}, {'description': 'Input data relevant for visualization saved in columnar comma-separated-file format.', 'edam_term': 'EDAM_1.24.format_3752', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/output/secondary_analysis.csv', 'size': 983184, 'type': 'csv'}, {'description': 'JSON-formatted information about this scRNA-seq run including scatterplot coordinates and 
clustering.', 'edam_term': 'EDAM_1.24.format_3464', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/output/secondary_analysis.cells.json', 'size': 1979893, 'type': 'json'}, {'description': \"JSON-formatted information about this scRNA-seq's clustering.\", 'edam_term': 'EDAM_1.24.format_3464', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/output/secondary_analysis.factors.json', 'size': 499703, 'type': 'json'}, {'description': \"JSON-formatted information about the heirarchy scRNA-seq's cells.\", 'edam_term': 'EDAM_1.24.format_3464', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/output/secondary_analysis.cell-sets.json', 'size': 514475, 'type': 'json'}, {'description': 'Input data relevant for visualization saved in columnar Apache Arrow format.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/output/secondary_analysis.arrow', 'size': 682130, 'type': 'arrow'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of velocity analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/scvelo_annotated.zarr/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of velocity analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/scvelo_annotated.zarr/layers/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of velocity analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/scvelo_annotated.zarr/layers/spliced/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of velocity analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 
'hubmap_ui/anndata-zarr/scvelo_annotated.zarr/layers/unspliced/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of velocity analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/scvelo_annotated.zarr/layers/spliced_unspliced_sum/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of velocity analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/scvelo_annotated.zarr/obsp/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of velocity analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/scvelo_annotated.zarr/obsp/distances/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of velocity analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/scvelo_annotated.zarr/obsp/connectivities/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of velocity analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/scvelo_annotated.zarr/obsm/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of velocity analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/scvelo_annotated.zarr/uns/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of velocity analysis.', 'edam_term': 'EDAM_1.24.format_2333', 
'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/scvelo_annotated.zarr/uns/velocity_graph_neg/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of velocity analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/scvelo_annotated.zarr/uns/velocity_params/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of velocity analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/scvelo_annotated.zarr/uns/neighbors/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of velocity analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/scvelo_annotated.zarr/uns/neighbors/params/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of velocity analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/scvelo_annotated.zarr/uns/umap/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of velocity analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/scvelo_annotated.zarr/uns/umap/params/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of velocity analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/scvelo_annotated.zarr/uns/leiden/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of velocity 
analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/scvelo_annotated.zarr/uns/leiden/params/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of velocity analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/scvelo_annotated.zarr/uns/velocity_graph/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of velocity analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/scvelo_annotated.zarr/uns/pca/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of velocity analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/scvelo_annotated.zarr/uns/pca/params/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of velocity analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/scvelo_annotated.zarr/uns/recover_dynamics/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of velocity analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/scvelo_annotated.zarr/var/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of velocity analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/scvelo_annotated.zarr/var/__categories/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single 
cell sequencing outputs of velocity analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/scvelo_annotated.zarr/obs/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of velocity analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/scvelo_annotated.zarr/obs/__categories/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of velocity analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/scvelo_annotated.zarr/varm/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/layers/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/layers/spliced/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/layers/unspliced/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store 
for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/layers/unscaled/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/layers/spliced_unspliced_sum/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/obsp/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/obsp/distances/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/obsp/connectivities/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/obsm/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 
'hubmap_ui/anndata-zarr/secondary_analysis.zarr/uns/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/uns/umap_density_params/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/uns/annotation_metadata/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/uns/annotation_metadata/reviewer2/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/uns/annotation_metadata/reviewer3/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/uns/annotation_metadata/disclaimers/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/uns/annotation_metadata/azimuth_to_CLID_mapping/.zgroup', 'size': 24, 'type': 
'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/uns/annotation_metadata/reviewer4/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/uns/annotation_metadata/seurat/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/uns/annotation_metadata/CLID/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/uns/annotation_metadata/azimuth_reference/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/uns/annotation_metadata/reviewer1/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/uns/annotation_metadata/azimuth/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell 
sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/uns/neighbors/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/uns/neighbors/params/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/uns/umap/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/uns/umap/params/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/uns/leiden/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/uns/leiden/params/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/uns/pca/.zgroup', 'size': 24, 
'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/uns/pca/params/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/uns/rank_genes_groups/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/uns/rank_genes_groups/params/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/uns/hvg/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/var/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/var/__categories/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 
'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/obs/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/obs/__categories/.zgroup', 'size': 24, 'type': 'unknown'}, {'description': 'AnnData Zarr store for storing and visualizing single cell sequencing outputs of UMAP/clustering analysis.', 'edam_term': 'EDAM_1.24.format_2333', 'is_qa_qc': False, 'rel_path': 'hubmap_ui/anndata-zarr/secondary_analysis.zarr/varm/.zgroup', 'size': 24, 'type': 'unknown'}]}",
            "last_modified_timestamp": 1699327413576,
            "last_modified_user_displayname": "Karl Burke",
            "last_modified_user_email": "KBURKE@pitt.edu",
            "last_modified_user_sub": "3aaa925d-755a-4193-98c7-44455de783ff",
            "pipeline_message": "the process ran",
            "published_timestamp": 1692324155031,
            "published_user_displayname": "HuBMAP Process",
            "published_user_email": "hubmap@hubmapconsortium.org",
            "published_user_sub": "3e7bce63-129d-33d0-8f6c-834b34cd382e",
            "status": "Published",
            "uuid": "615462a0e4aa133d8b19644c404e3eeb"
        }
    ],
    "provider_experiment_id": "BUKMAP_20190822A_SNARE2-R_N727",
    "published_timestamp": 1643320101710,
    "status": "Published",
    "status_history": "",
    "upload": "HBM947.VDBQ.325",
    "uuid": "421007293469db7b528ce6478c00348d"
}
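
One way to check which properties dominate the payload is to total the serialized size of each field across all processed_datasets entries. The following is a minimal sketch, not code from ingest-api: the function name `field_sizes` and the tiny `sample` list are made up for illustration, and the real input would be the list returned by /data-status.

```python
import json
from collections import Counter


def field_sizes(datasets):
    """Total the serialized size (in bytes) of every field found in processed_datasets."""
    sizes = Counter()
    for dataset in datasets:
        for processed in dataset.get("processed_datasets", []):
            for key, value in processed.items():
                sizes[key] += len(json.dumps(value).encode("utf-8"))
    return sizes


# Hypothetical stand-in for the /data-status response (one dataset, two processed datasets).
big_metadata = "{'files': [" + "x" * 5000 + "]}"
sample = [
    {
        "uuid": "421007293469db7b528ce6478c00348d",
        "processed_datasets": [
            {"uuid": "dd3e6da115fd13eee62d9a4fba2324ae", "status": "Published"},
            {
                "uuid": "615462a0e4aa133d8b19644c404e3eeb",
                "status": "Published",
                "ingest_metadata": big_metadata,
            },
        ],
    }
]

sizes = field_sizes(sample)
# Print the fields from largest to smallest.
for key, size in sizes.most_common():
    print(f"{key}: {size} bytes")
```

Run against the real response, a report like this should show whether ingest_metadata accounts for most of the ~70 MB or whether other fields matter too.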

Simply removing ingest_metadata would be enough to reduce the overall size of the response, but the modal only needs a handful of fields. My proposal would be to modify the Cypher query responsible for grabbing processed_datasets so that it only returns the following fields: UUID, HuBMAP/SenNet ID, status, creation_date, and globus_url.
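
The proposal is to do the trimming in the Cypher query itself. As a rough illustration of the intended result only, here is the same projection done in Python after the fact. The names `KEEP_FIELDS` and `trim_processed_datasets` are hypothetical, and mapping "creation_date" to the `created_timestamp` key and the ID to `hubmap_id` is an assumption based on the sample JSON.

```python
# Assumed key names for the fields the modal needs (taken from the sample JSON).
KEEP_FIELDS = ("uuid", "hubmap_id", "status", "created_timestamp", "globus_url")


def trim_processed_datasets(dataset):
    """Return a copy of the dataset whose processed_datasets keep only KEEP_FIELDS."""
    trimmed = dict(dataset)  # shallow copy so the caller's dict is not modified
    trimmed["processed_datasets"] = [
        {field: processed[field] for field in KEEP_FIELDS if field in processed}
        for processed in dataset.get("processed_datasets", [])
    ]
    return trimmed


# Hypothetical example: a cut-down version of the dataset shown above.
example = {
    "hubmap_id": "HBM575.XFCT.276",
    "processed_datasets": [
        {
            "uuid": "615462a0e4aa133d8b19644c404e3eeb",
            "hubmap_id": "HBM334.DWWF.436",
            "status": "Published",
            "created_timestamp": 1686252397263,
            "ingest_metadata": "{'dag_provenance_list': [], 'files': []}",
            "pipeline_message": "the process ran",
        }
    ],
}

result = trim_processed_datasets(example)
print(result["processed_datasets"])
```

Fields missing from a record (such as globus_url here) are skipped rather than filled in, so the output reflects only what the API actually returned.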

Note: I think there might be a bug in the current Ingest API code, in that the globus_url is not being generated for processed datasets, so there is no link in the UI for "Globus Directory".

shirey commented 6 months ago

Thanks @maxsibilla for the analysis and @libpitt for the recent PR. We'll evaluate to see the timing after the PR is released and possibly close this issue.

shirey commented 6 months ago

Closing, as this was fixed by the PR mentioned above.