hubmapconsortium / search-api

HuBMAP search service and associated pieces to create an index
https://search.api.hubmapconsortium.org
MIT License

transform method throws TypeError: argument of type 'NoneType' is not iterable #174

Closed etlds closed 4 years ago

etlds commented 4 years ago

uuid: fa3f5482e3502d7d5d86c61157ee6a7a

The doc, just before it is passed into the transform method, looks like this:

{'ancestor_ids': [], 'descendant_ids': ['a6f3b59708dda79f4a5b7afaa7e0dc94'], 'ancestors': [], 'descendants': [{'display_doi': 'HBM965.TSLJ.667', 'entity_type': 'Dataset', 'create_timestamp': 1596125205002, 'uuid': 'a6f3b59708dda79f4a5b7afaa7e0dc94', 'data_access_level': 'consortium', 'data_types': ['salmon_rnaseq_snareseq'], 'metadata': {'dag_provenance_list': [{'hash': 'de1b3fc', 'origin': 'https://github.com/hubmapconsortium/ingest-pipeline.git'}, {'hash': 'de1b3fc', 'origin': 'https://github.com/hubmapconsortium/ingest-pipeline.git'}, {'name': 'pipeline.cwl', 'hash': '536d6ed', 'origin': 'https://github.com/hubmapconsortium/salmon-rnaseq.git'}, {'name': 'h5ad-to-arrow.cwl', 'hash': 'fb19103', 'origin': 'https://github.com/hubmapconsortium/portal-containers.git'}], 'files': [{'rel_path': 'umap_by_leiden_cluster.pdf', 'type': 'pdf', 'size': 88197, 'description': 'UMAP plot of cells, colored by Leiden cluster ID', 'edam_term': 'EDAM_1.24.format_3508'}, {'rel_path': 'out.h5ad', 'type': 'h5ad', 'size': 33893408, 'description': "Raw gene expression, in HDF5 format, readable with the 'anndata' Python package", 'edam_term': 'EDAM_1.24.format_3590'}, {'rel_path': 'cluster-marker-genes/cluster_marker_genes.h5ad', 'type': 'h5ad', 'size': 222459860, 'description': "Normalized gene expression with additional metadata, in HDF5 format, readable with the 'anndata' Python package", 'edam_term': 'EDAM_1.24.format_3590'}, {'rel_path': 'cluster-marker-genes/output/cluster_marker_genes.cell-sets.json', 'type': 'json', 'size': 145547, 'description': "JSON-formatted information about the heirarchy scRNA-seq's cells.", 'edam_term': 'EDAM_1.24.format_3464'}, {'rel_path': 'cluster-marker-genes/output/cluster_marker_genes.factors.json', 'type': 'json', 'size': 140670, 'description': "JSON-formatted information about this scRNA-seq's clustering.", 'edam_term': 'EDAM_1.24.format_3464'}, {'rel_path': 'cluster-marker-genes/output/cluster_marker_genes.arrow', 'type': 'arrow', 'size': 160162, 
'description': 'Input data relevant for visualization saved in columnar Apache Arrow format.', 'edam_term': 'EDAM_1.24.format_2333'}, {'rel_path': 'cluster-marker-genes/output/cluster_marker_genes.csv', 'type': 'csv', 'size': 196716, 'description': 'Input data relevant for visualization saved in columnar comma-separated-file format.', 'edam_term': 'EDAM_1.24.format_3752'}, {'rel_path': 'cluster-marker-genes/output/cluster_marker_genes.cells.json', 'type': 'json', 'size': 552275, 'description': 'JSON-formatted information about this scRNA-seq run including scatterplot coordinates and clustering.', 'edam_term': 'EDAM_1.24.format_3464'}]}, 'contains_human_genetic_sequences': 'no', 'group_uuid': '03b3d854-ed44-11e8-8bce-0e368f3075e8', 'last_modified_timestamp': 1596130261974, 'created_by_user_displayname': 'HuBMAP Process', 'created_by_user_email': 'hubmap@hubmapconsortium.org', 'status': 'QA'}], 'immediate_descendants': [{'display_doi': 'HBM965.TSLJ.667', 'entity_type': 'Dataset', 'create_timestamp': 1596125205002, 'uuid': 'a6f3b59708dda79f4a5b7afaa7e0dc94', 'data_access_level': 'consortium', 'data_types': ['salmon_rnaseq_snareseq'], 'metadata': {'dag_provenance_list': [{'hash': 'de1b3fc', 'origin': 'https://github.com/hubmapconsortium/ingest-pipeline.git'}, {'hash': 'de1b3fc', 'origin': 'https://github.com/hubmapconsortium/ingest-pipeline.git'}, {'name': 'pipeline.cwl', 'hash': '536d6ed', 'origin': 'https://github.com/hubmapconsortium/salmon-rnaseq.git'}, {'name': 'h5ad-to-arrow.cwl', 'hash': 'fb19103', 'origin': 'https://github.com/hubmapconsortium/portal-containers.git'}], 'files': [{'rel_path': 'umap_by_leiden_cluster.pdf', 'type': 'pdf', 'size': 88197, 'description': 'UMAP plot of cells, colored by Leiden cluster ID', 'edam_term': 'EDAM_1.24.format_3508'}, {'rel_path': 'out.h5ad', 'type': 'h5ad', 'size': 33893408, 'description': "Raw gene expression, in HDF5 format, readable with the 'anndata' Python package", 'edam_term': 'EDAM_1.24.format_3590'}, {'rel_path': 
'cluster-marker-genes/cluster_marker_genes.h5ad', 'type': 'h5ad', 'size': 222459860, 'description': "Normalized gene expression with additional metadata, in HDF5 format, readable with the 'anndata' Python package", 'edam_term': 'EDAM_1.24.format_3590'}, {'rel_path': 'cluster-marker-genes/output/cluster_marker_genes.cell-sets.json', 'type': 'json', 'size': 145547, 'description': "JSON-formatted information about the heirarchy scRNA-seq's cells.", 'edam_term': 'EDAM_1.24.format_3464'}, {'rel_path': 'cluster-marker-genes/output/cluster_marker_genes.factors.json', 'type': 'json', 'size': 140670, 'description': "JSON-formatted information about this scRNA-seq's clustering.", 'edam_term': 'EDAM_1.24.format_3464'}, {'rel_path': 'cluster-marker-genes/output/cluster_marker_genes.arrow', 'type': 'arrow', 'size': 160162, 'description': 'Input data relevant for visualization saved in columnar Apache Arrow format.', 'edam_term': 'EDAM_1.24.format_2333'}, {'rel_path': 'cluster-marker-genes/output/cluster_marker_genes.csv', 'type': 'csv', 'size': 196716, 'description': 'Input data relevant for visualization saved in columnar comma-separated-file format.', 'edam_term': 'EDAM_1.24.format_3752'}, {'rel_path': 'cluster-marker-genes/output/cluster_marker_genes.cells.json', 'type': 'json', 'size': 552275, 'description': 'JSON-formatted information about this scRNA-seq run including scatterplot coordinates and clustering.', 'edam_term': 'EDAM_1.24.format_3464'}]}, 'contains_human_genetic_sequences': 'no', 'group_uuid': '03b3d854-ed44-11e8-8bce-0e368f3075e8', 'last_modified_timestamp': 1596130261974, 'created_by_user_displayname': 'HuBMAP Process', 'created_by_user_email': 'hubmap@hubmapconsortium.org', 'status': 'QA'}], 'immediate_ancestors': [], 'donor': None, 'origin_sample': {}, 'source_sample': {}, 'display_doi': 'HBM725.BRBB.566', 'entity_type': 'Dataset', 'create_timestamp': 1595272807663, 'uuid': 'fa3f5482e3502d7d5d86c61157ee6a7a', 'data_access_level': 'protected', 'data_types': 
['SNAREseq'], 'description': 'Single-nucleus SNARE-seq2 (dual-omic RNA and accessible chromatin sequencing) on adult human kidney medulla', 'metadata': {'dag_provenance_list': [{'hash': 'de1b3fc', 'origin': 'https://github.com/hubmapconsortium/ingest-pipeline.git'}], 'metadata': {'_from_metadatatsv': True, 'acquisition_instrument_model': 'NovaSeq', 'acquisition_instrument_vendor': 'Illumina', 'analyte_class': 'RNA', 'assay_category': 'sequence', 'assay_type': 'SNARE2-RNAseq', 'cell_barcode_offset': '10,48,86', 'cell_barcode_read': 'R2', 'cell_barcode_size': '8,8,8', 'collectiontype': 'single_metadatatsv', 'data_path': '.', 'donor_id': 'UCSD0004', 'execution_datetime': '2019-05-29 11:00', 'is_targeted': 'FALSE', 'is_technical_replicate': 'FALSE', 'library_adapter_sequence': 'CTGTCTCTTATACACATCT', 'library_average_fragment_size': '700', 'library_construction_protocols_io_doi': '10.17504/protocols.io.be5gjg3w', 'library_final_yield_unit': 'ng', 'library_final_yield_value': '795', 'library_id': 'KM31', 'library_layout': 'paired-end', 'library_pcr_cycles': '18', 'library_pcr_cycles_for_sample_index': '12', 'metadata_path': '.', 'operator': 'Nongluk Plongthongkum', 'operator_email': 'nplongth@eng.ucsd.edu', 'pi': 'Kun Zhang', 'pi_email': 'kzhang@eng.ucsd.edu', 'protocols_io_doi': '10.17504/protocols.io.be5gjg3w', 'rnaseq_assay_input': '6000', 'rnaseq_assay_method': 'SNARE2-RNAseq-RNA', 'sc_isolation_cell_number': '1580000', 'sc_isolation_enrichment': 'none', 'sc_isolation_entity': 'nucleus', 'sc_isolation_protocols_io_doi': '10.17504/protocols.io.ufketkw', 'sc_isolation_quality_metric': 'OK', 'sc_isolation_tissue_dissociation': 'dounce', 'sequencing_phix_percent': '20', 'sequencing_read_format': '70/6/104', 'sequencing_read_percent_q30': '89.99', 'sequencing_reagent_kit': 'NovaSeq 6000 S4 Reagent', 'tissue_id': 'UCSD0004-RK-2-1-1'}}, 'contains_human_genetic_sequences': 'yes', 'group_uuid': '03b3d854-ed44-11e8-8bce-0e368f3075e8', 'last_modified_timestamp': 1597697945935, 
'created_by_user_displayname': 'Blue Lake', 'created_by_user_email': 'b1lake@ucsd.edu', 'status': 'Published', 'group_name': 'University of California San Diego TMC', 'update_timestamp': 1599783743212, 'update_timestamp_fmted': '2020-09-10 20:22:23', 'index_version': '1.5.2.7'}

mccalluc commented 4 years ago

Hmm... That document maps without any errors on my end. Let me look at the stack trace more closely...

mccalluc commented 4 years ago

So: Your document has 'donor': None. I don't think that's correct, but I don't understand why it causes an error for you but not for me.
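
For illustration, here is a minimal sketch of how a `None` value can produce this kind of `TypeError`. The function names and the `'metadata'` membership check are hypothetical and are not taken from the actual transform code; the sketch only shows that an `in` test against `None` raises, while the same test against an empty dict does not.

```python
# Hypothetical helpers; NOT the real search-api transform code.

def has_donor_metadata_unsafe(doc):
    # Membership test directly on doc['donor']; raises TypeError if it is None.
    return 'metadata' in doc['donor']


def has_donor_metadata_safe(doc):
    # Treat a missing or None donor as an empty dict before testing membership.
    donor = doc.get('donor') or {}
    return 'metadata' in donor


# Trimmed to a few fields from the doc in this issue.
doc = {'donor': None, 'origin_sample': {}, 'source_sample': {}}

try:
    has_donor_metadata_unsafe(doc)
except TypeError as err:
    print(f'TypeError: {err}')

print(has_donor_metadata_safe(doc))  # False
```

Whether the guard belongs in the transform or the upstream document should never carry `'donor': None` is a separate question; the sketch only demonstrates the failure mode.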

mccalluc commented 4 years ago

@etlds: From Slack, it sounded like you thought the problem was on your end, so I'll close this... but please reopen if you'd like anything to change on my end. (I'll have a PR soon to help you debug in the future.)
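
In the meantime, a doc as large as the one above is hard to scan by eye. The following is a rough sketch of a helper that lists every path whose value is `None`. The name `find_none_paths` is made up for this example and is unrelated to the PR mentioned above.

```python
# Hypothetical debugging helper; not part of search-api.

def find_none_paths(value, path=''):
    """Yield a dotted path for every None found in nested dicts and lists."""
    if value is None:
        yield path or '<root>'
    elif isinstance(value, dict):
        for key, child in value.items():
            child_path = f'{path}.{key}' if path else str(key)
            yield from find_none_paths(child, child_path)
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from find_none_paths(child, f'{path}[{index}]')


# Trimmed to a few fields from the doc in this issue.
doc = {
    'ancestors': [],
    'donor': None,
    'origin_sample': {},
    'source_sample': {},
    'uuid': 'fa3f5482e3502d7d5d86c61157ee6a7a',
}

print(list(find_none_paths(doc)))  # ['donor']
```

Running something like this on the doc right before the transform call should make a stray `None` easier to spot.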