DataBiosphere / azul

Metadata indexer and query service used for AnVIL, HCA, LungMAP, and CGP
Apache License 2.0

Integration Test fails during PFB manifest generation #3247

Closed by amarjandu 3 years ago

amarjandu commented 3 years ago
======================================================================
FAIL: test_indexing (integration_test.IndexingIntegrationTest) [manifest] (catalog='it2ebi', format='terra.pfb', attempts=1)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/builds/ucsc/azul/test/integration_test.py", line 175, in subTest
    yield
  File "/builds/ucsc/azul/test/integration_test.py", line 335, in _test_manifest
    response = self._check_endpoint(config.service_endpoint(), '/manifest/files', params)
  File "/builds/ucsc/azul/test/integration_test.py", line 368, in _check_endpoint
    return self._get_url_content(url.url)
  File "/builds/ucsc/azul/test/integration_test.py", line 374, in _get_url_content
    return self._get_url(url).data
  File "/builds/ucsc/azul/test/integration_test.py", line 390, in _get_url
    self._assertResponseStatus(response, expected_statuses)
  File "/builds/ucsc/azul/test/integration_test.py", line 398, in _assertResponseStatus
    assert response.status in expected_statuses, (
AssertionError: (500, 'Internal Server Error', b'Traceback (most recent call last):\n  File "/var/task/chalice/app.py", line 1135, in _get_view_function_response\n    response = view_function(**function_args)\n  File "/var/task/app.py", line 1492, in file_manifest\n    return _file_manifest(fetch=False)\n  File "/var/task/app.py", line 1552, in _file_manifest\n    return app.manifest_controller.get_manifest_async(self_url=app.self_url(),\n  File "/var/task/azul/service/manifest_controller.py", line 127, in get_manifest_async\n    token_or_state = self.async_service.inspect_generation(token)\n  File "/var/task/azul/service/async_manifest_service.py", line 99, in inspect_generation\n    raise StateMachineError(status, output)\nazul.service.step_function_helper.StateMachineError: (\'Failed to generate manifest\', \'FAILED\', None)\n')
----------------------------------------------------------------------
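The assertion message above embeds the service's own traceback as an escaped bytes literal, which is hard to read on one line. A minimal sketch for unpacking it, assuming the message is the plain `(status, reason, body)` tuple shown above; the helper name `unpack_assertion` is hypothetical and the sample is abbreviated to the last frame:

```python
import ast


def unpack_assertion(message: str) -> tuple[int, str, str]:
    """Parse the '(status, reason, body)' tuple printed by the failed assertion."""
    # literal_eval only evaluates Python literals, so it is safe on log text
    status, reason, body = ast.literal_eval(message)
    return status, reason, body.decode('utf-8')


# Abbreviated copy of the message from the test output (middle frames left out)
sample = (
    "(500, 'Internal Server Error', "
    "b'Traceback (most recent call last):\\n"
    "  File \"/var/task/azul/service/async_manifest_service.py\", line 99, in inspect_generation\\n"
    "    raise StateMachineError(status, output)\\n"
    "azul.service.step_function_helper.StateMachineError: "
    "(\\'Failed to generate manifest\\', \\'FAILED\\', None)\\n')"
)

status, reason, text = unpack_assertion(sample)
print(status, reason)
print(text)
```

Printed this way, the last line of the body shows that the service only reports that the step function execution failed; the underlying error has to be looked up in the execution and Lambda logs.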

See:

https://console.aws.amazon.com/states/home?region=us-east-1#/statemachines/view/arn:aws:states:us-east-1:122796619775:stateMachine:azul-manifest-dev?statusFilter=FAILED

https://console.aws.amazon.com/cloudwatch/home?region=us-east-1#logsV2:logs-insights$3FqueryDetail$3D$257E$2528end$257E$25272021-07-17T06*3a59*3a59.000Z$257Estart$257E$25272021-07-15T07*3a00*3a00.000Z$257EtimeType$257E$2527ABSOLUTE$257Etz$257E$2527Local$257EeditorString$257E$2527fields*20*40timestamp*2c*20*40message*0a*7c*20sort*20*40timestamp*20desc*0a*7c*20filter*20*40message*20like*20*27Error*27$257EisLiveTail$257Efalse$257EqueryId$257E$2527845d53c7-6c2d-47ec-9d0a-58920cb80ca4$257Esource$257E$2528$257E$2527*2faws*2flambda*2fazul-service-dev-manifest$2529$2529

CloudWatch Logs Insights
region: us-east-1
log-group-names: /aws/lambda/azul-service-dev-manifest
start-time: 2021-07-15T07:00:00.000Z
end-time: 2021-07-17T06:59:59.000Z
query-string:

fields @timestamp, @message
| sort @timestamp desc
| filter @message like 'Error'
[ERROR] AssertionError: ({'entity_id': '83014a45-cfc9-5140-928e-22f612e23e9c', 'contents': {'sample_specimens': [{'has_input_biomaterial': [None], '_source': ['specimen_from_organism'], 'document_id': ['bb7e366f-bb0d-4442-a0de-006192abf055', 'edf650e3-af33-41de-bc46-3e0f03be9a15'], 'biomaterial_id': ['CBTM-364B_CAE', 'CBTM-364B_MLN'], 'disease': ['normal'], 'organ': ['immune organ', 'large intestine'], 'organ_part': ['caecum', 'mesenteric lymph node'], 'storage_method': [None], 'preservation_method': [None], '_type': ['specimen']}], 'samples': [{'document_id': ['bb7e366f-bb0d-4442-a0de-006192abf055', 'edf650e3-af33-41de-bc46-3e0f03be9a15'], 'biomaterial_id': ['CBTM-364B_CAE', 'CBTM-364B_MLN'], 'entity_type': ['specimens'], 'organ': ['immune organ', 'large intestine'], 'organ_part': ['caecum', 'mesenteric lymph node'], 'model_organ': [None], 'model_organ_part': [None], 'effective_organ': ['immune organ', 'large intestine']}], 'sequencing_inputs': [{'document_id': ['2e519099-e2c1-47ae-bd15-e9b5217bd8ea', '519a3665-5f9a-42b8-abfa-ad0db9881fa4'], 'biomaterial_id': ['364b_Caecum_P1', '364b_mLN_L4'], 'sequencing_input_type': ['cell_suspension']}], 'specimens': [{'has_input_biomaterial': [None], '_source': ['specimen_from_organism'], 'document_id': ['bb7e366f-bb0d-4442-a0de-006192abf055', 'edf650e3-af33-41de-bc46-3e0f03be9a15'], 'biomaterial_id': ['CBTM-364B_CAE', 'CBTM-364B_MLN'], 'disease': ['normal'], 'organ': ['immune organ', 'large intestine'], 'organ_part': ['caecum', 'mesenteric lymph node'], 'storage_method': [None], 'preservation_method': [None], '_type': ['specimen']}], 'cell_suspensions': [{'document_id': ['2e519099-e2c1-47ae-bd15-e9b5217bd8ea'], 'biomaterial_id': ['364b_mLN_L4'], 'total_estimated_cells': 1, 'selected_cell_type': ['T cell'], 'organ': ['immune organ'], 'organ_part': ['mesenteric lymph node']}, {'document_id': ['519a3665-5f9a-42b8-abfa-ad0db9881fa4'], 'biomaterial_id': ['364b_Caecum_P1'], 'total_estimated_cells': 1, 'selected_cell_type': ['T cell'], 'organ': ['large intestine'], 'organ_part': ['caecum']}], 'cell_lines': [], 'donors': [{'document_id': ['47d6f836-243e-4ec6-bc77-03eb72cad26d'], 'biomaterial_id': ['CBTM-364B'], 'biological_sex': ['male'], 'genus_species': ['Homo sapiens'], 'development_stage': ['human adult stage'], 'diseases': ['normal'], 'organism_age': [{'value': '55-60', 'unit': 'year'}], 'organism_age_value': ['55-60'], 'organism_age_unit': ['year'], 'organism_age_range': [{'gte': 1734480000.0, 'lte': 1892160000.0}], 'donor_count': 1}], 'organoids': [], 'files': [{'content-type': 'application/unknown; dcp-type=data', 'indexed': False, 'name': 'Distinct_microbial_and_immune_niches_of_the_human_colon.loom', 'crc32c': '37773c86', 'sha256': '3a9fa6f7cd989c9b699b3dc8684d1314eba0387a7cd1c79de688486d01689a55', 'size': 7038415, 'uuid': '79834f57-1150-59ac-a5e9-06ac9407b3e8', 'drs_path': 'v1_c5cfdcda-995d-4f6e-94b7-e0532f6d83ca_6f74b14e-e803-4741-bec1-28b8a42284f6', 'version': '2020-10-20T17:59:59.000000Z', 'document_id': '83014a45-cfc9-5140-928e-22f612e23e9c', 'file_type': 'analysis_file', 'file_format': 'unknown', 'content_description': [None], 'is_intermediate': None, 'source': None, '_type': 'file', 'related_files': [], 'matrix_cell_count': None}], 'analysis_protocols': [{'workflow': ['optimus_v4.0.0']}], 'imaging_protocols': [], 'library_preparation_protocols': [{'library_construction_approach': ['Smart-seq2'], 'nucleic_acid_source': ['single cell']}], 'sequencing_protocols': [{'instrument_manufacturer_model': ['Illumina HiSeq 4000'], 'paired_end': [True]}], 'sequencing_processes': [{'document_id': ['00027d78-bcde-444c-b51c-b7446cdf9ea9', '0009e534-5d10-4881-88bc-ed5d67d1b37c']}], 'projects': [{'project_title': ['Distinct microbial and immune niches of the human colon'], 'project_short_name': ['ColonImmune10XSS2VDJ'], 'laboratory': ['Cellular Genetics', 'Centre for Immunobiology, Blizard Institute', 'Department of Haematology', 'Department of Molecular and Translational Sciences', 'Department of Surgery & NIHR Cambridge Biomedical Research Centre', 'Human Cell Atlas (Sarah Teichmann)', 'Human Cell Atlas Data Coordination Platform', 'Human Genetics', 'Infection Genomics', 'MRC Laboratory of Molecular Biology'], 'institutions': ['EMBL-EBI', 'Monash University', 'Queen Mary University of London', 'University of Cambridge', 'Wellcome Sanger Institute'], 'document_id': ['83f5188e-3bf7-4956-9544-cea4f8997756'], 'publication_titles': ['Distinct microbial and immune niches of the human colon'], 'insdc_project_accessions': ['ERP115651', 'ERP118125', 'ERP118165', 'ERP118273', 'ERP118315'], 'geo_series_accessions': [None], 'array_express_accessions': ['E-MTAB-8007', 'E-MTAB-8474', 'E-MTAB-8476', 'E-MTAB-8484', 'E-MTAB-8486'], 'insdc_study_accessions': ['PRJEB32912', 'PRJEB35119', 'PRJEB35153', 'PRJEB35244', 'PRJEB35284'], 'supplementary_links': ['https://www.gutcellatlas.org/'], '_type': ['project']}]}, 'num_contributions': 1, 'sources': [{'id': 'c5cfdcda-995d-4f6e-94b7-e0532f6d83ca', 'spec': 'tdr:broad-jade-dev-data:snapshot/hca_dev_20201023_ebiv4___20210302:'}], 'bundles': [{'uuid': '77892d68-4fac-5fa2-ac06-6aaf78880c19', 'version': '2020-10-28T11:26:39.000000Z'}], 'total_estimated_cells': 2}, 'cell_suspensions')
Traceback (most recent call last):
  File "/var/task/app.py", line 1562, in generate_manifest
    return app.manifest_controller.get_manifest(event)
  File "/var/task/azul/service/manifest_controller.py", line 67, in get_manifest
    result = self.service.get_manifest(format_=ManifestFormat(state['format_']),
  File "/var/task/azul/service/manifest_service.py", line 364, in get_manifest
    partition = generator.write(object_key, partition)
  File "/var/task/azul/service/manifest_service.py", line 967, in write
    file_path, base_name = self.create_file()
  File "/var/task/azul/service/manifest_service.py", line 1449, in create_file
    converter.add_doc(doc)
  File "/var/task/azul/service/avro_pfb.py", line 105, in add_doc
    assert False, (doc, entity_type)
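
The Logs Insights query above can also be reproduced outside the console. The following is a sketch only: it assumes boto3's CloudWatch Logs `start_query` call, which takes the time range as epoch seconds, and the helper name `insights_query_args` is hypothetical. The AWS call itself is left as a comment so the snippet runs without credentials:

```python
from datetime import datetime


def insights_query_args(log_group: str, start: str, end: str, query: str) -> dict:
    """Build keyword arguments for CloudWatch Logs `start_query` from ISO-8601 timestamps."""

    def to_epoch(ts: str) -> int:
        # fromisoformat() before Python 3.11 does not accept a trailing 'Z'
        return int(datetime.fromisoformat(ts.replace('Z', '+00:00')).timestamp())

    return {
        'logGroupName': log_group,
        'startTime': to_epoch(start),
        'endTime': to_epoch(end),
        'queryString': query,
    }


# Values copied from the query details above
args = insights_query_args(
    '/aws/lambda/azul-service-dev-manifest',
    '2021-07-15T07:00:00.000Z',
    '2021-07-17T06:59:59.000Z',
    "fields @timestamp, @message | sort @timestamp desc | filter @message like 'Error'",
)

# With boto3 installed and credentials configured, the query could be started with:
#   boto3.client('logs', region_name='us-east-1').start_query(**args)
print(args)
```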

Time of failure was ~4:35 PST

https://gitlab.dev.singlecell.gi.ucsc.edu/ucsc/azul/-/jobs/24559

melainalegaspi commented 3 years ago

@amarjandu spike to add step function execution logs.

amarjandu commented 3 years ago

Added the AssertionError and traceback to the description.

melainalegaspi commented 3 years ago

@jessebrennan to check if this is a duplicate of an existing ticket.

jessebrennan commented 3 years ago

This is a duplicate of #3157