DataBiosphere / azul

Metadata indexer and query service used for AnVIL, HCA, LungMAP, and CGP
Apache License 2.0

Compact manifest for dcp2 catalog fails when filtered on file formats #2649

Open achave11-ucsc opened 3 years ago

achave11-ucsc commented 3 years ago

The request …

http 'https://service.dev.singlecell.gi.ucsc.edu/fetch/manifest/files?catalog=dcp2&filters={"fileFormat":{"is":["bam","csv","csv.gz","docx","fastq","h5","h5ad","loom","mtx.gz","pdf","Robj","tar","tar.gz","tsv","tsv.gz","txt","txt.gz","xlsx","zip"]}}&format=compact'

… fails when following the redirect.

HTTP/1.1 500 Internal Server Error
…

Traceback (most recent call last):
  File "/var/task/chalice/app.py", line 1135, in _get_view_function_response
    response = view_function(**function_args)
  File "/var/task/app.py", line 1465, in start_manifest_generation_fetch
    wait_time, manifest = handle_manifest_generation_request()
  File "/var/task/app.py", line 1504, in handle_manifest_generation_request
    return async_service.start_or_inspect_manifest_generation(app.self_url(),
  File "/var/task/azul/service/async_manifest_service.py", line 92, in start_or_inspect_manifest_generation
    time_or_manifest = self._get_manifest_status(token['execution_id'], request_index)
  File "/var/task/azul/service/async_manifest_service.py", line 154, in _get_manifest_status
    raise StateMachineError(status, output)
azul.service.step_function_helper.StateMachineError: ('Failed to generate manifest', 'FAILED', None)

CloudWatch Logs Insights
region: us-east-1
log-group-names: /aws/lambda/azul-service-dev-manifest
start-time: 2020-12-18T23:23:10.000Z
end-time: 2020-12-18T23:37:18.000Z
query-string:

fields @timestamp, @message
| filter @requestId like "f178da11-417f-4ab1-8c23-0689ab0a9a89"
| sort @timestamp asc
| limit 45
@timestamp @message
2020-12-18 23:23:53.993 START RequestId: f178da11-417f-4ab1-8c23-0689ab0a9a89 Version: $LATEST
2020-12-18 23:23:54.007 [INFO] 2020-12-18T23:23:54.7Z f178da11-417f-4ab1-8c23-0689ab0a9a89 Found credentials in environment variables.
2020-12-18 23:23:54.116 [INFO] 2020-12-18T23:23:54.116Z f178da11-417f-4ab1-8c23-0689ab0a9a89 Cached manifest not found: manifests/f98bdf91-5607-5d20-983d-e84066c4c386.tsv
2020-12-18 23:23:54.162 [DEBUG] 2020-12-18T23:23:54.162Z f178da11-417f-4ab1-8c23-0689ab0a9a89 Creating ES client [search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443]
2020-12-18 23:23:54.187 [INFO] 2020-12-18T23:23:54.186Z f178da11-417f-4ab1-8c23-0689ab0a9a89 Found credentials in environment variables.
2020-12-18 23:23:54.323 [INFO] 2020-12-18T23:23:54.322Z f178da11-417f-4ab1-8c23-0689ab0a9a89 GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/azul_v2_dev_dcp2_files_aggregate/_search?scroll=5m&size=1000 [status:200 request:0.125s]
2020-12-18 23:23:56.712 [INFO] 2020-12-18T23:23:56.712Z f178da11-417f-4ab1-8c23-0689ab0a9a89 GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/_search/scroll?scroll=5m [status:200 request:0.125s]
2020-12-18 23:23:58.089 [INFO] 2020-12-18T23:23:58.88Z f178da11-417f-4ab1-8c23-0689ab0a9a89 GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/_search/scroll?scroll=5m [status:200 request:0.166s]
2020-12-18 23:23:59.546 [INFO] 2020-12-18T23:23:59.546Z f178da11-417f-4ab1-8c23-0689ab0a9a89 GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/_search/scroll?scroll=5m [status:200 request:0.124s]
2020-12-18 23:24:01.053 [INFO] 2020-12-18T23:24:01.53Z f178da11-417f-4ab1-8c23-0689ab0a9a89 GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/_search/scroll?scroll=5m [status:200 request:0.164s]
2020-12-18 23:24:02.843 [INFO] 2020-12-18T23:24:02.843Z f178da11-417f-4ab1-8c23-0689ab0a9a89 GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/_search/scroll?scroll=5m [status:200 request:0.301s]
2020-12-18 23:24:04.341 [INFO] 2020-12-18T23:24:04.341Z f178da11-417f-4ab1-8c23-0689ab0a9a89 GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/_search/scroll?scroll=5m [status:200 request:0.132s]
2020-12-18 23:24:05.924 [INFO] 2020-12-18T23:24:05.923Z f178da11-417f-4ab1-8c23-0689ab0a9a89 GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/_search/scroll?scroll=5m [status:200 request:0.191s]
2020-12-18 23:24:07.643 [INFO] 2020-12-18T23:24:07.642Z f178da11-417f-4ab1-8c23-0689ab0a9a89 GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/_search/scroll?scroll=5m [status:200 request:0.333s]
2020-12-18 23:24:09.305 [INFO] 2020-12-18T23:24:09.305Z f178da11-417f-4ab1-8c23-0689ab0a9a89 GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/_search/scroll?scroll=5m [status:200 request:0.218s]
2020-12-18 23:24:10.812 [INFO] 2020-12-18T23:24:10.812Z f178da11-417f-4ab1-8c23-0689ab0a9a89 GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/_search/scroll?scroll=5m [status:200 request:0.140s]
2020-12-18 23:24:12.303 [INFO] 2020-12-18T23:24:12.303Z f178da11-417f-4ab1-8c23-0689ab0a9a89 GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/_search/scroll?scroll=5m [status:200 request:0.117s]
2020-12-18 23:24:13.760 [INFO] 2020-12-18T23:24:13.759Z f178da11-417f-4ab1-8c23-0689ab0a9a89 GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/_search/scroll?scroll=5m [status:200 request:0.127s]
2020-12-18 23:24:15.120 [INFO] 2020-12-18T23:24:15.119Z f178da11-417f-4ab1-8c23-0689ab0a9a89 GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/_search/scroll?scroll=5m [status:200 request:0.128s]
2020-12-18 23:24:16.695 [INFO] 2020-12-18T23:24:16.694Z f178da11-417f-4ab1-8c23-0689ab0a9a89 GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/_search/scroll?scroll=5m [status:200 request:0.129s]
2020-12-18 23:24:18.311 [INFO] 2020-12-18T23:24:18.310Z f178da11-417f-4ab1-8c23-0689ab0a9a89 GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/_search/scroll?scroll=5m [status:200 request:0.268s]
2020-12-18 23:24:20.390 [INFO] 2020-12-18T23:24:20.390Z f178da11-417f-4ab1-8c23-0689ab0a9a89 GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/_search/scroll?scroll=5m [status:200 request:0.167s]
2020-12-18 23:36:52.443 [WARNING] 2020-12-18T23:36:52.443Z f178da11-417f-4ab1-8c23-0689ab0a9a89 GET https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/_search/scroll?scroll=5m [status:404 request:0.071s]
2020-12-18 23:36:52.474 [INFO] 2020-12-18T23:36:52.474Z f178da11-417f-4ab1-8c23-0689ab0a9a89 DELETE https://search-azul-index-dev-atlcgwfjbk6dvrtmiaskbs2swa.us-east-1.es.amazonaws.com:443/_search/scroll [status:404 request:0.031s]
2020-12-18 23:36:52.478 [ERROR] 2020-12-18T23:36:52.474Z f178da11-417f-4ab1-8c23-0689ab0a9a89 Upload hmLpgYeT3F3zrqXtXyBuZIk8xlRisVuIG0ZZW.X6qvOLghbhyc_8sGOyZiY8kqiocJuBU2o6S3dawiA6nZX74g6giq1FonC8ujYsRbreYdLSWN1PDdOW4CPqt6bh.5PwuJHX4MbKWgCGsYe303jXdEc6oa5Z.mybNqA47cWaMD4-: Error detected within the MPU context.
Traceback (most recent call last):
  File "/var/task/azul/service/manifest_service.py", line 279, in _generate_manifest
    base_name = generator.write_to(text_buffer)
  File "/var/task/azul/service/manifest_service.py", line 739, in write_to
    for hit in self._create_request().scan():
  File "/opt/python/elasticsearch_dsl/search.py", line 723, in scan
    for hit in scan(
  File "/opt/python/elasticsearch/helpers/actions.py", line 443, in scan
    resp = client.scroll(
  File "/opt/python/elasticsearch/client/utils.py", line 84, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/opt/python/elasticsearch/client/__init__.py", line 1373, in scroll
    return self.transport.perform_request(
  File "/opt/python/elasticsearch/transport.py", line 351, in perform_request
    status, headers_response, data = connection.perform_request(
  File "/opt/python/elasticsearch/connection/http_requests.py", line 161, in perform_request
    self._raise_error(response.status_code, raw_data)
  File "/opt/python/elasticsearch/connection/base.py", line 229, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(
elasticsearch.exceptions.NotFoundError: NotFoundError(404, 'search_phase_execution_exception', 'No search context found for id [100290778]')
2020-12-18 23:36:52.478 [INFO] 2020-12-18T23:36:52.478Z f178da11-417f-4ab1-8c23-0689ab0a9a89 Upload hmLpgYeT3F3zrqXtXyBuZIk8xlRisVuIG0ZZW.X6qvOLghbhyc_8sGOyZiY8kqiocJuBU2o6S3dawiA6nZX74g6giq1FonC8ujYsRbreYdLSWN1PDdOW4CPqt6bh.5PwuJHX4MbKWgCGsYe303jXdEc6oa5Z.mybNqA47cWaMD4-: Aborting
2020-12-18 23:36:52.741 [WARNING] 2020-12-18T23:36:52.741Z f178da11-417f-4ab1-8c23-0689ab0a9a89 Upload hmLpgYeT3F3zrqXtXyBuZIk8xlRisVuIG0ZZW.X6qvOLghbhyc_8sGOyZiY8kqiocJuBU2o6S3dawiA6nZX74g6giq1FonC8ujYsRbreYdLSWN1PDdOW4CPqt6bh.5PwuJHX4MbKWgCGsYe303jXdEc6oa5Z.mybNqA47cWaMD4-: Aborted
2020-12-18 23:36:52.745 END RequestId: f178da11-417f-4ab1-8c23-0689ab0a9a89
2020-12-18 23:36:52.745 REPORT RequestId: f178da11-417f-4ab1-8c23-0689ab0a9a89 Duration: 778746.24 ms Billed Duration: 778747 ms Memory Size: 1024 MB Max Memory Used: 214 MB Init Duration: 1861.86 ms
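
One thing the log above does show: the last successful scroll request was logged at 23:24:20.390 and the next one, which returned 404, at 23:36:52.443, about 12.5 minutes later, while every request asked for `scroll=5m`. A gap longer than the requested keep-alive would be consistent with the `No search context found` error, though the log alone does not say what the function was doing in between. Below is a minimal sketch for spotting such gaps in a log excerpt. The helper name `find_scroll_gaps` and the regular expression are made up for illustration and assume the line layout shown above.

```python
import re
from datetime import datetime, timedelta

# Hypothetical helper, not part of Azul. Assumes each log line starts with a
# "YYYY-MM-DD HH:MM:SS.mmm" timestamp, as in the excerpt above.
SCROLL_LINE = re.compile(
    r'^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) .*'
    r'/_search/scroll\?scroll=(\d+)m '
)


def find_scroll_gaps(lines):
    """Yield (previous, current, gap) for consecutive scroll requests whose
    spacing exceeds the keep-alive requested via the scroll=<N>m parameter."""
    previous = None
    for line in lines:
        match = SCROLL_LINE.match(line)
        if match is None:
            continue  # not a scroll continuation request
        current = datetime.strptime(match.group(1), '%Y-%m-%d %H:%M:%S.%f')
        keep_alive = timedelta(minutes=int(match.group(2)))
        if previous is not None and current - previous > keep_alive:
            yield previous, current, current - previous
        previous = current


# Two lines abbreviated from the excerpt above (host and request ID elided).
excerpt = [
    '2020-12-18 23:24:20.390 [INFO] GET https://host/_search/scroll?scroll=5m [status:200 request:0.167s]',
    '2020-12-18 23:36:52.443 [WARNING] GET https://host/_search/scroll?scroll=5m [status:404 request:0.071s]',
]
for start, end, gap in find_scroll_gaps(excerpt):
    print(f'{start.time()} -> {end.time()}: {gap.total_seconds():.0f} s between scroll requests')
```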

dsotirho-ucsc commented 3 years ago

Manifest generation with a broad filter (all file formats) appears to fail because it exceeds the 15-minute timeout configured for manifest generation.

Unfortunately, the logs surrounding the originally reported error are hard to filter because multiple manifest requests were made within a short time. With further repeated tests, however, I was able to locate the log entries for when manifest generation was requested and when it failed. In each of these cases the error occurred just over 15 minutes after the request. (Note: these tests used a "dcp1" catalog manifest because manifest generation succeeded (failed to fail) for the "dcp2" catalog.)

1608588399752   2020-12-21T14:06:39.752 dcp1 {"fileFormat":{"is":["bam","csv","csv.gz","docx","fastq","h5","h5ad","loom","mtx.gz","pdf","Robj","tar","tar.gz","tsv","tsv.gz","txt","txt.gz","xlsx","zip"]}}
1608589377781   2020-12-21T14:22:57.781 Caught exception for <function start_manifest_generation_fetch at 0x7f2b4f93c700>
=> 16.30 min

1608591592269   2020-12-21T14:59:52.269 dcp1 {"fileFormat":{"is":["bai","bam","csv","csv.gz","docx","fastq","fastq.gz","matrix","npy","npz","pdf","results","txt","unknown"]}}
1608592516730   2020-12-21T15:15:16.730 Caught exception for <function start_manifest_generation_fetch at 0x7f3e2eebe700>
=> 15.41 min

1608594253495   2020-12-21T15:44:13.495 dcp1 {"fileFormat":{"is":["bai","bam","csv","csv.gz","docx","fastq","fastq.gz","matrix","npy","npz","pdf","results","txt","unknown"]}}
1608595175786   2020-12-21T15:59:35.786 Caught exception for <function start_manifest_generation_fetch at 0x7f3509130700>
=> 15.37 min

1608653538573   2020-12-22T08:12:18.573 dcp1 {"fileFormat":{"is":["bam","csv","csv.gz","docx","fastq","h5","h5ad","loom","mtx.gz","pdf","Robj","tar","tar.gz","tsv","tsv.gz","txt","txt.gz","xlsx","zip"]}}
1608654451137   2020-12-22T08:27:31.137 Caught exception for <function start_manifest_generation_fetch at 0x7fac455ab700>
=> 15.21 min
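
The durations above are the difference between the two epoch-millisecond timestamps in each pair, converted to minutes. A small sketch of that arithmetic follows; the function name `elapsed_minutes` is illustrative only.

```python
def elapsed_minutes(start_ms: int, end_ms: int) -> float:
    """Minutes between two epoch timestamps given in milliseconds."""
    return (end_ms - start_ms) / 1000 / 60


# (request logged, exception logged) pairs copied from the log lines above
pairs = [
    (1608588399752, 1608589377781),
    (1608591592269, 1608592516730),
    (1608594253495, 1608595175786),
    (1608653538573, 1608654451137),
]
for start_ms, end_ms in pairs:
    print(f'{elapsed_minutes(start_ms, end_ms):.2f} min')
```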

dsotirho-ucsc commented 3 years ago

Further combing of the logs found timeout messages for my attempts; however, no such message was found for the 2020-12-18 error in the ticket description.

1608588399752   2020-12-21T14:06:39.752 dcp1 {"fileFormat":{"is":["bam","csv","csv.gz","docx","fastq","h5","h5ad","loom","mtx.gz","pdf","Robj","tar","tar.gz","tsv","tsv.gz","txt","txt.gz","xlsx","zip"]}}
1608589307316   2020-12-21T14:21:47.316 Task timed out after 900.10 seconds
1608589377781   2020-12-21T14:22:57.781 Caught exception for <function start_manifest_generation_fetch at 0x7f2b4f93c700>

1608591592269   2020-12-21T14:59:52.269 dcp1 {"fileFormat":{"is":["bai","bam","csv","csv.gz","docx","fastq","fastq.gz","matrix","npy","npz","pdf","results","txt","unknown"]}}
1608591604904   2020-12-21T15:00:04.904 dcp1 {"fileFormat":{"is":["bai","bam","csv","csv.gz","docx","fastq","fastq.gz","matrix","npy","npz","pdf","results","txt","unknown"]}}
1608592507199   2020-12-21T15:15:07.199 Task timed out after 900.10 seconds
1608592516730   2020-12-21T15:15:16.730 Caught exception for <function start_manifest_generation_fetch at 0x7f3e2eebe700>
1608592519608   2020-12-21T15:15:19.608 Task timed out after 900.10 seconds

1608594253495   2020-12-21T15:44:13.495 dcp1 {"fileFormat":{"is":["bai","bam","csv","csv.gz","docx","fastq","fastq.gz","matrix","npy","npz","pdf","results","txt","unknown"]}}
1608595167358   2020-12-21T15:59:27.358 Task timed out after 900.08 seconds
1608595175786   2020-12-21T15:59:35.786 Caught exception for <function start_manifest_generation_fetch at 0x7f3509130700>

1608653538573   2020-12-22T08:12:18.573 dcp1 {"fileFormat":{"is":["bam","csv","csv.gz","docx","fastq","h5","h5ad","loom","mtx.gz","pdf","Robj","tar","tar.gz","tsv","tsv.gz","txt","txt.gz","xlsx","zip"]}}
1608654443409   2020-12-22T08:27:23.409 Task timed out after 900.09 seconds
1608654451137   2020-12-22T08:27:31.137 Caught exception for <function start_manifest_generation_fetch at 0x7fac455ab700>
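
Read in order, each `Task timed out` line lands a little over 900 seconds after the corresponding request line, including in the second group, where two requests overlap. A rough sketch of that pairing, assuming requests and timeouts match up first-in, first-out (the log lines carry no request ID to confirm this, so treat it as an assumption):

```python
from collections import deque

# (epoch ms, kind) events from the second group above, in log order.
# Matching requests to timeouts FIFO is an assumption, not something the
# log lines themselves establish.
events = [
    (1608591592269, 'request'),
    (1608591604904, 'request'),
    (1608592507199, 'timeout'),
    (1608592519608, 'timeout'),
]


def seconds_to_timeout(events):
    """Pair each timeout with the oldest unmatched request and return the
    seconds that elapsed between the two."""
    pending = deque()
    offsets = []
    for timestamp_ms, kind in events:
        if kind == 'request':
            pending.append(timestamp_ms)
        elif kind == 'timeout' and pending:
            offsets.append((timestamp_ms - pending.popleft()) / 1000)
    return offsets


print(seconds_to_timeout(events))
```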

melainalegaspi commented 3 years ago

Triage to discuss next steps.

melainalegaspi commented 3 years ago

@hannes-ucsc: "@danielsotirhos and I are clueless as to how to improve the hotfix any further. The hotfix truncates large entries arbitrarily, causing data loss."

theathorn commented 3 years ago

Spike to see if this still occurs.

amarjandu commented 3 years ago

Using the command provided in the description …

http 'https://service.dev.singlecell.gi.ucsc.edu/fetch/manifest/files?catalog=dcp2&filters={"fileFormat":{"is":["bam","csv","csv.gz","docx","fastq","h5","h5ad","loom","mtx.gz","pdf","Robj","tar","tar.gz","tsv","tsv.gz","txt","txt.gz","xlsx","zip"]}}&format=compact'

… the redirect resolved to a file in around 10 minutes.

amarjandu commented 3 years ago

Prod is also able to resolve to a file when following the redirects:

http 'https://service.azul.data.humancellatlas.org/fetch/manifest/files?catalog=dcp10&filters={"fileFormat":{"is":["bam","csv","csv.gz","docx","fastq","h5","h5ad","loom","mtx.gz","pdf","Robj","tar","tar.gz","tsv","tsv.gz","txt","txt.gz","xlsx","zip"]}}&format=compact'

melainalegaspi commented 3 years ago

Search prod and dev logs for messages containing "NotFoundError(404, 'search_phase_execution_exception', 'No search context found for id" to see if this error still occurs.

amarjandu commented 3 years ago

The last time these messages were observed in dev and prod was in December 2020.

**CloudWatch Logs Insights**  
region: us-east-1  
log-group-names: /aws/lambda/azul-service-dev-manifest  
start-time: -31104000s  
end-time: 0s  
query-string:

fields @timestamp, @message |filter @message like "NotFoundError(404, 'search_phase_execution_exception', 'No search context found for id" | sort @timestamp desc

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|       @timestamp        |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               @message                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
|-------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 2020-12-18 23:36:52.743 | [ERROR] NotFoundError: NotFoundError(404, 'search_phase_execution_exception', 'No search context found for id [100290778]') Traceback (most recent call last):   File "/var/task/app.py", line 1532, in generate_manifest     manifest = service.get_manifest(format_=ManifestFormat(event['format']),   File "/var/task/azul/service/manifest_service.py", line 187, in get_manifest     file_name = self._generate_manifest(generator, object_key)   File "/var/task/azul/service/manifest_service.py", line 279, in _generate_manifest     base_name = generator.write_to(text_buffer)   File "/var/task/azul/service/manifest_service.py", line 739, in write_to     for hit in self._create_request().scan():   File "/opt/python/elasticsearch_dsl/search.py", line 723, in scan     for hit in scan(   File "/opt/python/elasticsearch/helpers/actions.py", line 443, in scan     resp = client.scroll(   File "/opt/python/elasticsearch/client/utils.py", line 84, in _wrapped     return func(*args, params=params, **kwargs)   File "/opt/python/elasticsearch/client/__init__.py", line 1373, in scroll     return self.transport.perform_request(   File "/opt/python/elasticsearch/transport.py", line 351, in perform_request     status, headers_response, data = connection.perform_request(   File "/opt/python/elasticsearch/connection/http_requests.py", line 161, in perform_request     self._raise_error(response.status_code, raw_data)   File "/opt/python/elasticsearch/connection/base.py", line 229, in _raise_error     raise HTTP_EXCEPTIONS.get(status_code, TransportError)(                                                                                                                        |
| 2020-12-18 23:36:52.478 | [ERROR] 2020-12-18T23:36:52.474Z f178da11-417f-4ab1-8c23-0689ab0a9a89 Upload hmLpgYeT3F3zrqXtXyBuZIk8xlRisVuIG0ZZW.X6qvOLghbhyc_8sGOyZiY8kqiocJuBU2o6S3dawiA6nZX74g6giq1FonC8ujYsRbreYdLSWN1PDdOW4CPqt6bh.5PwuJHX4MbKWgCGsYe303jXdEc6oa5Z.mybNqA47cWaMD4-: Error detected within the MPU context. Traceback (most recent call last):   File "/var/task/azul/service/manifest_service.py", line 279, in _generate_manifest     base_name = generator.write_to(text_buffer)   File "/var/task/azul/service/manifest_service.py", line 739, in write_to     for hit in self._create_request().scan():   File "/opt/python/elasticsearch_dsl/search.py", line 723, in scan     for hit in scan(   File "/opt/python/elasticsearch/helpers/actions.py", line 443, in scan     resp = client.scroll(   File "/opt/python/elasticsearch/client/utils.py", line 84, in _wrapped     return func(*args, params=params, **kwargs)   File "/opt/python/elasticsearch/client/__init__.py", line 1373, in scroll     return self.transport.perform_request(   File "/opt/python/elasticsearch/transport.py", line 351, in perform_request     status, headers_response, data = connection.perform_request(   File "/opt/python/elasticsearch/connection/http_requests.py", line 161, in perform_request     self._raise_error(response.status_code, raw_data)   File "/opt/python/elasticsearch/connection/base.py", line 229, in _raise_error     raise HTTP_EXCEPTIONS.get(status_code, TransportError)( elasticsearch.exceptions.NotFoundError: NotFoundError(404, 'search_phase_execution_exception', 'No search context found for id [100290778]')                                                                                           |
| 2020-12-18 23:28:09.154 | [ERROR] NotFoundError: NotFoundError(404, 'search_phase_execution_exception', 'No search context found for id [100283021]') Traceback (most recent call last):   File "/var/task/app.py", line 1532, in generate_manifest     manifest = service.get_manifest(format_=ManifestFormat(event['format']),   File "/var/task/azul/service/manifest_service.py", line 187, in get_manifest     file_name = self._generate_manifest(generator, object_key)   File "/var/task/azul/service/manifest_service.py", line 279, in _generate_manifest     base_name = generator.write_to(text_buffer)   File "/var/task/azul/service/manifest_service.py", line 739, in write_to     for hit in self._create_request().scan():   File "/opt/python/elasticsearch_dsl/search.py", line 723, in scan     for hit in scan(   File "/opt/python/elasticsearch/helpers/actions.py", line 443, in scan     resp = client.scroll(   File "/opt/python/elasticsearch/client/utils.py", line 84, in _wrapped     return func(*args, params=params, **kwargs)   File "/opt/python/elasticsearch/client/__init__.py", line 1373, in scroll     return self.transport.perform_request(   File "/opt/python/elasticsearch/transport.py", line 351, in perform_request     status, headers_response, data = connection.perform_request(   File "/opt/python/elasticsearch/connection/http_requests.py", line 161, in perform_request     self._raise_error(response.status_code, raw_data)   File "/opt/python/elasticsearch/connection/base.py", line 229, in _raise_error     raise HTTP_EXCEPTIONS.get(status_code, TransportError)(                                                                                                                        |
| 2020-12-18 23:28:08.851 | [ERROR] 2020-12-18T23:28:08.848Z e51d25ac-a3a2-4b53-9543-0d420c684777 Upload 431szg4.uFG.oCg5Q6OdVEsXF4jv9gn8.V.SqctE0SPCIRJhRJ3WVVDlAsxb7mNvNI6zMWJm.1C0jgJB4fqGc574eiJZHgSAR.8YZ5Q9kQsM_SsYkKoqR3gsW7AoVIrTRDi.93okvuKpy3ZeNQ9k9Zq78T70rJm9m4jigm5Gv_g-: Error detected within the MPU context. Traceback (most recent call last):   File "/var/task/azul/service/manifest_service.py", line 279, in _generate_manifest     base_name = generator.write_to(text_buffer)   File "/var/task/azul/service/manifest_service.py", line 739, in write_to     for hit in self._create_request().scan():   File "/opt/python/elasticsearch_dsl/search.py", line 723, in scan     for hit in scan(   File "/opt/python/elasticsearch/helpers/actions.py", line 443, in scan     resp = client.scroll(   File "/opt/python/elasticsearch/client/utils.py", line 84, in _wrapped     return func(*args, params=params, **kwargs)   File "/opt/python/elasticsearch/client/__init__.py", line 1373, in scroll     return self.transport.perform_request(   File "/opt/python/elasticsearch/transport.py", line 351, in perform_request     status, headers_response, data = connection.perform_request(   File "/opt/python/elasticsearch/connection/http_requests.py", line 161, in perform_request     self._raise_error(response.status_code, raw_data)   File "/opt/python/elasticsearch/connection/base.py", line 229, in _raise_error     raise HTTP_EXCEPTIONS.get(status_code, TransportError)( elasticsearch.exceptions.NotFoundError: NotFoundError(404, 'search_phase_execution_exception', 'No search context found for id [100283021]')                                                                                           |
| 2020-12-18 23:04:52.776 | [ERROR] NotFoundError: NotFoundError(404, 'search_phase_execution_exception', 'No search context found for id [100276658]') Traceback (most recent call last):   File "/var/task/app.py", line 1532, in generate_manifest     manifest = service.get_manifest(format_=ManifestFormat(event['format']),   File "/var/task/azul/service/manifest_service.py", line 187, in get_manifest     file_name = self._generate_manifest(generator, object_key)   File "/var/task/azul/service/manifest_service.py", line 279, in _generate_manifest     base_name = generator.write_to(text_buffer)   File "/var/task/azul/service/manifest_service.py", line 739, in write_to     for hit in self._create_request().scan():   File "/opt/python/elasticsearch_dsl/search.py", line 723, in scan     for hit in scan(   File "/opt/python/elasticsearch/helpers/actions.py", line 443, in scan     resp = client.scroll(   File "/opt/python/elasticsearch/client/utils.py", line 84, in _wrapped     return func(*args, params=params, **kwargs)   File "/opt/python/elasticsearch/client/__init__.py", line 1373, in scroll     return self.transport.perform_request(   File "/opt/python/elasticsearch/transport.py", line 351, in perform_request     status, headers_response, data = connection.perform_request(   File "/opt/python/elasticsearch/connection/http_requests.py", line 161, in perform_request     self._raise_error(response.status_code, raw_data)   File "/opt/python/elasticsearch/connection/base.py", line 229, in _raise_error     raise HTTP_EXCEPTIONS.get(status_code, TransportError)(                                                                                                                        |
| 2020-12-18 23:04:52.493 | [ERROR] 2020-12-18T23:04:52.489Z 9719ca3b-c24b-4636-b904-3db096c987b8 Upload fctn46PQPUcytHaux.fd1SpJyjQ9fa1A7R8AcCTxB6Wh19dCqPN_2bf_YQ7cfiV7EsQZm2i4XY7C5DRZwfQuJxfFaVtmaxW.TNvM0blVJPV_mrbxdma6aj_X2igYG_l1X_Msxda.SLQE9sPi4G6Ml3oybExinkH3y3h9OwgiYy8-: Error detected within the MPU context. Traceback (most recent call last):   File "/var/task/azul/service/manifest_service.py", line 279, in _generate_manifest     base_name = generator.write_to(text_buffer)   File "/var/task/azul/service/manifest_service.py", line 739, in write_to     for hit in self._create_request().scan():   File "/opt/python/elasticsearch_dsl/search.py", line 723, in scan     for hit in scan(   File "/opt/python/elasticsearch/helpers/actions.py", line 443, in scan     resp = client.scroll(   File "/opt/python/elasticsearch/client/utils.py", line 84, in _wrapped     return func(*args, params=params, **kwargs)   File "/opt/python/elasticsearch/client/__init__.py", line 1373, in scroll     return self.transport.perform_request(   File "/opt/python/elasticsearch/transport.py", line 351, in perform_request     status, headers_response, data = connection.perform_request(   File "/opt/python/elasticsearch/connection/http_requests.py", line 161, in perform_request     self._raise_error(response.status_code, raw_data)   File "/opt/python/elasticsearch/connection/base.py", line 229, in _raise_error     raise HTTP_EXCEPTIONS.get(status_code, TransportError)( elasticsearch.exceptions.NotFoundError: NotFoundError(404, 'search_phase_execution_exception', 'No search context found for id [100276658]')                                                                                           |
| 2020-12-16 16:52:40.261 | [ERROR] NotFoundError: NotFoundError(404, 'search_phase_execution_exception', 'No search context found for id [100186066]') Traceback (most recent call last):   File "/var/task/app.py", line 1532, in generate_manifest     manifest = service.get_manifest(format_=ManifestFormat(event['format']),   File "/var/task/azul/service/manifest_service.py", line 187, in get_manifest     file_name = self._generate_manifest(generator, object_key)   File "/var/task/azul/service/manifest_service.py", line 279, in _generate_manifest     base_name = generator.write_to(text_buffer)   File "/var/task/azul/service/manifest_service.py", line 739, in write_to     for hit in self._create_request().scan():   File "/opt/python/elasticsearch_dsl/search.py", line 723, in scan     for hit in scan(   File "/opt/python/elasticsearch/helpers/actions.py", line 443, in scan     resp = client.scroll(   File "/opt/python/elasticsearch/client/utils.py", line 84, in _wrapped     return func(*args, params=params, **kwargs)   File "/opt/python/elasticsearch/client/__init__.py", line 1373, in scroll     return self.transport.perform_request(   File "/opt/python/elasticsearch/transport.py", line 351, in perform_request     status, headers_response, data = connection.perform_request(   File "/opt/python/elasticsearch/connection/http_requests.py", line 161, in perform_request     self._raise_error(response.status_code, raw_data)   File "/opt/python/elasticsearch/connection/base.py", line 229, in _raise_error     raise HTTP_EXCEPTIONS.get(status_code, TransportError)(                                                                                                                        |
| 2020-12-16 16:52:39.960 | [ERROR] 2020-12-16T16:52:39.957Z 14b9681e-4aec-4e68-89d5-deec46a82df9 Upload woWy5n_gncR8ZHuViaNc4b6gUSLMacDIKcHEJLD5n0e_HIX7l2uduZv4NmOupmYRq3MWoUBELnm01ru6d6qRrnRArTi4bU8tlgl0GdXMKTs79LAQbTpwqxqmkNk6_ydIGnuKY.k3aTlilbp90XfpmKxpT19FpOXuuMBoTTTTi1w-: Error detected within the MPU context. Traceback (most recent call last):   File "/var/task/azul/service/manifest_service.py", line 279, in _generate_manifest     base_name = generator.write_to(text_buffer)   File "/var/task/azul/service/manifest_service.py", line 739, in write_to     for hit in self._create_request().scan():   File "/opt/python/elasticsearch_dsl/search.py", line 723, in scan     for hit in scan(   File "/opt/python/elasticsearch/helpers/actions.py", line 443, in scan     resp = client.scroll(   File "/opt/python/elasticsearch/client/utils.py", line 84, in _wrapped     return func(*args, params=params, **kwargs)   File "/opt/python/elasticsearch/client/__init__.py", line 1373, in scroll     return self.transport.perform_request(   File "/opt/python/elasticsearch/transport.py", line 351, in perform_request     status, headers_response, data = connection.perform_request(   File "/opt/python/elasticsearch/connection/http_requests.py", line 161, in perform_request     self._raise_error(response.status_code, raw_data)   File "/opt/python/elasticsearch/connection/base.py", line 229, in _raise_error     raise HTTP_EXCEPTIONS.get(status_code, TransportError)( elasticsearch.exceptions.NotFoundError: NotFoundError(404, 'search_phase_execution_exception', 'No search context found for id [100186066]')                                                                                           |
| 2020-11-30 20:39:51.656 | [ERROR] NotFoundError: NotFoundError(404, 'search_phase_execution_exception', 'No search context found for id [87948250]') Traceback (most recent call last):   File "/var/task/app.py", line 1534, in generate_manifest     manifest = service.get_manifest(format_=ManifestFormat(event['format']),   File "/var/task/azul/service/manifest_service.py", line 187, in get_manifest     file_name = self._generate_manifest(generator, object_key)   File "/var/task/azul/service/manifest_service.py", line 257, in _generate_manifest     file_path, base_name = generator.create_file()   File "/var/task/azul/service/manifest_service.py", line 879, in create_file     self._samples_tsv(samples_tsv)   File "/var/task/azul/service/manifest_service.py", line 946, in _samples_tsv     for hit in self._create_request().scan():   File "/opt/python/elasticsearch_dsl/search.py", line 723, in scan     for hit in scan(   File "/opt/python/elasticsearch/helpers/actions.py", line 443, in scan     resp = client.scroll(   File "/opt/python/elasticsearch/client/utils.py", line 84, in _wrapped     return func(*args, params=params, **kwargs)   File "/opt/python/elasticsearch/client/__init__.py", line 1373, in scroll     return self.transport.perform_request(   File "/opt/python/elasticsearch/transport.py", line 351, in perform_request     status, headers_response, data = connection.perform_request(   File "/opt/python/elasticsearch/connection/http_requests.py", line 161, in perform_request     self._raise_error(response.status_code, raw_data)   File "/opt/python/elasticsearch/connection/base.py", line 229, in _raise_error     raise HTTP_EXCEPTIONS.get(status_code, TransportError)( |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
**CloudWatch Logs Insights**  
region: us-east-1  
log-group-names: /aws/lambda/azul-service-prod-manifest  
start-time: -31104000s  
end-time: 0s  
query-string:

fields @timestamp, @message  
| filter @message like "NotFoundError(404, 'search_phase_execution_exception', 'No search context found for id"  
| sort @timestamp desc

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|       @timestamp        |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 @message                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
|-------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 2020-12-22 23:22:18.401 | [ERROR] NotFoundError: NotFoundError(404, 'search_phase_execution_exception', 'No search context found for id [20896351]') Traceback (most recent call last):   File "/var/task/app.py", line 1532, in generate_manifest     manifest = service.get_manifest(format_=ManifestFormat(event['format']),   File "/var/task/azul/service/manifest_service.py", line 187, in get_manifest     file_name = self._generate_manifest(generator, object_key)   File "/var/task/azul/service/manifest_service.py", line 279, in _generate_manifest     base_name = generator.write_to(text_buffer)   File "/var/task/azul/service/manifest_service.py", line 739, in write_to     for hit in self._create_request().scan():   File "/opt/python/elasticsearch_dsl/search.py", line 723, in scan     for hit in scan(   File "/opt/python/elasticsearch/helpers/actions.py", line 443, in scan     resp = client.scroll(   File "/opt/python/elasticsearch/client/utils.py", line 84, in _wrapped     return func(*args, params=params, **kwargs)   File "/opt/python/elasticsearch/client/__init__.py", line 1373, in scroll     return self.transport.perform_request(   File "/opt/python/elasticsearch/transport.py", line 351, in perform_request     status, headers_response, data = connection.perform_request(   File "/opt/python/elasticsearch/connection/http_requests.py", line 161, in perform_request     self._raise_error(response.status_code, raw_data)   File "/opt/python/elasticsearch/connection/base.py", line 229, in _raise_error     raise HTTP_EXCEPTIONS.get(status_code, TransportError)(                              |
| 2020-12-22 23:22:18.125 | [ERROR] 2020-12-22T23:22:18.122Z 982ffed4-7430-4288-b076-6014ccfe4315 Upload Ac32ADiAqcUEY61hGWmBJ9T0S_C1kPuoogCxsI9Nn5vDh8wKUVAF0F8fusR3WIhyxo..S_.DWE5FCBzbi9_kWDptKEDUNI31ZtG4nl386PDXb97x_KpJfJOmGQDXNWD6.PferQq57V3be5JYeIhNCc6qIy0kqxuoDFlTmRFEljU-: Error detected within the MPU context. Traceback (most recent call last):   File "/var/task/azul/service/manifest_service.py", line 279, in _generate_manifest     base_name = generator.write_to(text_buffer)   File "/var/task/azul/service/manifest_service.py", line 739, in write_to     for hit in self._create_request().scan():   File "/opt/python/elasticsearch_dsl/search.py", line 723, in scan     for hit in scan(   File "/opt/python/elasticsearch/helpers/actions.py", line 443, in scan     resp = client.scroll(   File "/opt/python/elasticsearch/client/utils.py", line 84, in _wrapped     return func(*args, params=params, **kwargs)   File "/opt/python/elasticsearch/client/__init__.py", line 1373, in scroll     return self.transport.perform_request(   File "/opt/python/elasticsearch/transport.py", line 351, in perform_request     status, headers_response, data = connection.perform_request(   File "/opt/python/elasticsearch/connection/http_requests.py", line 161, in perform_request     self._raise_error(response.status_code, raw_data)   File "/opt/python/elasticsearch/connection/base.py", line 229, in _raise_error     raise HTTP_EXCEPTIONS.get(status_code, TransportError)( elasticsearch.exceptions.NotFoundError: NotFoundError(404, 'search_phase_execution_exception', 'No search context found for id [20896351]') |
| 2020-12-22 20:23:37.309 | [ERROR] NotFoundError: NotFoundError(404, 'search_phase_execution_exception', 'No search context found for id [20893347]') Traceback (most recent call last):   File "/var/task/app.py", line 1532, in generate_manifest     manifest = service.get_manifest(format_=ManifestFormat(event['format']),   File "/var/task/azul/service/manifest_service.py", line 187, in get_manifest     file_name = self._generate_manifest(generator, object_key)   File "/var/task/azul/service/manifest_service.py", line 279, in _generate_manifest     base_name = generator.write_to(text_buffer)   File "/var/task/azul/service/manifest_service.py", line 739, in write_to     for hit in self._create_request().scan():   File "/opt/python/elasticsearch_dsl/search.py", line 723, in scan     for hit in scan(   File "/opt/python/elasticsearch/helpers/actions.py", line 443, in scan     resp = client.scroll(   File "/opt/python/elasticsearch/client/utils.py", line 84, in _wrapped     return func(*args, params=params, **kwargs)   File "/opt/python/elasticsearch/client/__init__.py", line 1373, in scroll     return self.transport.perform_request(   File "/opt/python/elasticsearch/transport.py", line 351, in perform_request     status, headers_response, data = connection.perform_request(   File "/opt/python/elasticsearch/connection/http_requests.py", line 161, in perform_request     self._raise_error(response.status_code, raw_data)   File "/opt/python/elasticsearch/connection/base.py", line 229, in _raise_error     raise HTTP_EXCEPTIONS.get(status_code, TransportError)(                              |
| 2020-12-22 20:23:36.947 | [ERROR] 2020-12-22T20:23:36.944Z b8cef799-deaa-4ab7-8d35-5925c3460680 Upload DLEde2vyBEyWxtE46PJIwSdyk4WZnr5ungnLxae5gYggmL860tQ9hkK.AbFODAWS.dK0WB54YeWXFE9t.yJia1TgKtxpcvv_muBHNwoaAEzMkCGM.pUtjRTWRnwgUXDBxfAEIJfr4wmWJz7Vz7b9V8F31CfPSliIAMxA6eRybHE-: Error detected within the MPU context. Traceback (most recent call last):   File "/var/task/azul/service/manifest_service.py", line 279, in _generate_manifest     base_name = generator.write_to(text_buffer)   File "/var/task/azul/service/manifest_service.py", line 739, in write_to     for hit in self._create_request().scan():   File "/opt/python/elasticsearch_dsl/search.py", line 723, in scan     for hit in scan(   File "/opt/python/elasticsearch/helpers/actions.py", line 443, in scan     resp = client.scroll(   File "/opt/python/elasticsearch/client/utils.py", line 84, in _wrapped     return func(*args, params=params, **kwargs)   File "/opt/python/elasticsearch/client/__init__.py", line 1373, in scroll     return self.transport.perform_request(   File "/opt/python/elasticsearch/transport.py", line 351, in perform_request     status, headers_response, data = connection.perform_request(   File "/opt/python/elasticsearch/connection/http_requests.py", line 161, in perform_request     self._raise_error(response.status_code, raw_data)   File "/opt/python/elasticsearch/connection/base.py", line 229, in _raise_error     raise HTTP_EXCEPTIONS.get(status_code, TransportError)( elasticsearch.exceptions.NotFoundError: NotFoundError(404, 'search_phase_execution_exception', 'No search context found for id [20893347]') |
| 2020-12-16 16:33:23.087 | [ERROR] NotFoundError: NotFoundError(404, 'search_phase_execution_exception', 'No search context found for id [20803387]') Traceback (most recent call last):   File "/var/task/app.py", line 1532, in generate_manifest     manifest = service.get_manifest(format_=ManifestFormat(event['format']),   File "/var/task/azul/service/manifest_service.py", line 187, in get_manifest     file_name = self._generate_manifest(generator, object_key)   File "/var/task/azul/service/manifest_service.py", line 279, in _generate_manifest     base_name = generator.write_to(text_buffer)   File "/var/task/azul/service/manifest_service.py", line 739, in write_to     for hit in self._create_request().scan():   File "/opt/python/elasticsearch_dsl/search.py", line 723, in scan     for hit in scan(   File "/opt/python/elasticsearch/helpers/actions.py", line 443, in scan     resp = client.scroll(   File "/opt/python/elasticsearch/client/utils.py", line 84, in _wrapped     return func(*args, params=params, **kwargs)   File "/opt/python/elasticsearch/client/__init__.py", line 1373, in scroll     return self.transport.perform_request(   File "/opt/python/elasticsearch/transport.py", line 351, in perform_request     status, headers_response, data = connection.perform_request(   File "/opt/python/elasticsearch/connection/http_requests.py", line 161, in perform_request     self._raise_error(response.status_code, raw_data)   File "/opt/python/elasticsearch/connection/base.py", line 229, in _raise_error     raise HTTP_EXCEPTIONS.get(status_code, TransportError)(                              |
| 2020-12-16 16:33:22.709 | [ERROR] 2020-12-16T16:33:22.704Z b2fe04fc-afe9-4eb6-b994-9a4f3870080c Upload xjeiVuVYDbmND3SzMaQOwZghAjtQ0vlPg70EDc4isd5ePJebH9unZeeYr.YzA3nT7iwzgyBnjKvxPwNaRK93WmheA6dCQl7C5KGAXdkXpCaPczs6dNURI3K0uzC3kfNgGomohnTTPrR5IATU_zS1IIHACvF.8lp7.oke0nPGTvY-: Error detected within the MPU context. Traceback (most recent call last):   File "/var/task/azul/service/manifest_service.py", line 279, in _generate_manifest     base_name = generator.write_to(text_buffer)   File "/var/task/azul/service/manifest_service.py", line 739, in write_to     for hit in self._create_request().scan():   File "/opt/python/elasticsearch_dsl/search.py", line 723, in scan     for hit in scan(   File "/opt/python/elasticsearch/helpers/actions.py", line 443, in scan     resp = client.scroll(   File "/opt/python/elasticsearch/client/utils.py", line 84, in _wrapped     return func(*args, params=params, **kwargs)   File "/opt/python/elasticsearch/client/__init__.py", line 1373, in scroll     return self.transport.perform_request(   File "/opt/python/elasticsearch/transport.py", line 351, in perform_request     status, headers_response, data = connection.perform_request(   File "/opt/python/elasticsearch/connection/http_requests.py", line 161, in perform_request     self._raise_error(response.status_code, raw_data)   File "/opt/python/elasticsearch/connection/base.py", line 229, in _raise_error     raise HTTP_EXCEPTIONS.get(status_code, TransportError)( elasticsearch.exceptions.NotFoundError: NotFoundError(404, 'search_phase_execution_exception', 'No search context found for id [20803387]') |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
melainalegaspi commented 3 years ago

@hannes-ucsc : "This doesn't appear to be an issue anymore."

nadove-ucsc commented 3 years ago

I don't think this should be closed while we have outstanding FIXMEs for it: https://github.com/DataBiosphere/azul/blob/2036eb064172410b7c047edf21e9235fa9343fa5/src/azul/service/manifest_service.py#L1321

melainalegaspi commented 3 years ago

@hannes-ucsc : "The FIXME @noah-aviel-dove is referring to precedes a workaround for this issue, not a permanent fix. The workaround arbitrarily selects the first 100 values. We need a better solution that is not lossy."
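
To illustrate why a workaround of this kind is lossy, here is a minimal sketch. It is not the code in `manifest_service.py`; the names `MAX_VALUES`, `truncate_values` and `dropped_count` are hypothetical, and the threshold of 100 is taken only from the comment above.

```python
from typing import Any, List

# Hypothetical threshold, mirroring the "first 100 values" mentioned above.
MAX_VALUES = 100


def truncate_values(values: List[Any], limit: int = MAX_VALUES) -> List[Any]:
    """Keep only the first `limit` values; the rest are silently discarded."""
    return values[:limit]


def dropped_count(values: List[Any], limit: int = MAX_VALUES) -> int:
    """Number of values that truncation would discard."""
    return max(0, len(values) - limit)


if __name__ == '__main__':
    cell_values = [f'value_{i}' for i in range(250)]
    kept = truncate_values(cell_values)
    # The consumer of the manifest has no way to tell that values are missing
    # unless the number of dropped values is reported separately.
    print(len(kept), dropped_count(cell_values))
```

Which values survive depends purely on their order in the input, which is what makes the selection arbitrary.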

hannes-ucsc commented 2 years ago

We truncate fields in other places, too, so maybe the workaround IS a permanent solution. We need to consolidate the truncation thresholds (#3725) and reevaluate whether the solution for #3248 makes the truncation in the manifest code redundant. If it does, remove both the truncation and the FIXME. If the truncation in the manifest is still needed, remove just the FIXME.