Closed by prathapsridharan 8 months ago
@atarashansky has reported new load times as of 11/03/2023 in the following doc: https://docs.google.com/document/d/1w-XUG8AmioSYKekWrA1cSjyoGsAPKzE5c6V4C5gCm_U/edit#heading=h.8a1xw6x4n17t
Example: adding all marker genes for "stem cell" and "ciliated epithelial cell" for "brain" tissue into the dotplot (this gives about 40 genes), with group-by: Disease:
- Total time: 23 seconds (TileDB query time: 10 seconds, build response time: 9 seconds)
- Response size: 35 MB, and it is not compressed by CloudFront (takes about 400 ms to download the content with my fast internet connection, 800 Mbps)
cc @atarashansky @joyceyan @kaloster @dsadgat
NOTE: Please view these as only suggestions. They are based on what was observed in the tests below, and it is very likely that more tests are needed to check the validity of these suggestions. As it pertains to TileDB, it is worth doing more profiling tests just for TileDB, and perhaps asking the TileDB team for help in understanding its inner workings.
The tests suggest that queries on TileDB may have a bottleneck in transporting large amounts of data from AWS S3 to the webserver making the query. Even if TileDB fetches exactly the final result set from S3, traversing an object store like S3 is quite likely slower than a Linux filesystem. What seems likely is that TileDB fetches a larger block of data from S3 and then does further filtering on the web server that made the query.
Embedded databases (other examples: sqlite, LevelDB, IndexedDB, etc) are typically used for niche use cases like mobile applications, web browsers, and sharding high-volume writes (and perhaps even reads). Typical web applications use a database server where the data and the database engine are on the same machine. Actions like moving the data to the data processor, or sharding the data, would rarely be considered in the database-server model unless the workloads are truly large.
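For concreteness, here is what the embedded model looks like with Python's bundled sqlite3 module: the engine runs in-process and operates directly on local data, with no separate database server. (The `genes` table here is purely illustrative, not project code.)

```python
import sqlite3

# sqlite is an embedded database: the engine runs inside this process
# and operates directly on a local file (":memory:" here for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE genes (id TEXT, expression REAL)")
conn.executemany(
    "INSERT INTO genes VALUES (?, ?)",
    [("ENSG00000188596", 1.5), ("ENSG00000255794", 2.3)],
)
# The query never crosses a network boundary; data and engine are colocated.
rows = conn.execute("SELECT COUNT(*) FROM genes").fetchone()
print(rows[0])  # -> 2
```

The contrast with our setup is that TileDB here is embedded in the web server but its data lives in S3, so the "data and engine colocated" property is lost.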
These tests were run on staging rather than prod, because the more detailed Datadog tracing hadn't been deployed to prod at the time of testing.
Test: pick all marker genes from "stem_cell" and "ciliated epithelial cell" for "brain" tissue (this gives 50 genes), then group by "disease".
The query to expression_summary accounts for over 50% of the query time, as noted in the previous comment. In this test, I repeated the following query several times in a short span to see the effects on _query for expression_summary and rollup.
Command line query used for testing (50 genes, group by disease):
curl -X 'POST' \
'https://api.cellxgene.staging.single-cell.czi.technology/wmg/v2/query' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{"compare":"disease","filter":{"dataset_ids":[],"development_stage_ontology_term_ids":[],"disease_ontology_term_ids":[],"gene_ontology_term_ids":["ENSG00000188596","ENSG00000255794","ENSG00000130413","ENSG00000118997","ENSG00000165309","ENSG00000115423","ENSG00000197748","ENSG00000105877","ENSG00000186952","ENSG00000114670","ENSG00000186409","ENSG00000155761","ENSG00000152582","ENSG00000170959","ENSG00000157423","ENSG00000168038","ENSG00000174776","ENSG00000107249","ENSG00000253438","ENSG00000206530","ENSG00000133640","ENSG00000080298","ENSG00000162814","ENSG00000250305","ENSG00000156042","ENSG00000164692","ENSG00000168542","ENSG00000011465","ENSG00000108821","ENSG00000106809","ENSG00000139329","ENSG00000129009","ENSG00000113140","ENSG00000105894","ENSG00000248527","ENSG00000114115","ENSG00000100097","ENSG00000152661","ENSG00000234741","ENSG00000245910","ENSG00000087303","ENSG00000237550","ENSG00000175061","ENSG00000116774","ENSG00000189058","ENSG00000038427","ENSG00000198856","ENSG00000185222","ENSG00000203875","ENSG00000118418"],"organism_ontology_term_id":"NCBITaxon:9606","self_reported_ethnicity_ontology_term_ids":[],"sex_ontology_term_ids":[],"publication_citations":[]},"is_rollup":true}'
The query to expression_summary produces a result containing ~400,000 rows - that is a lot of rows to fetch for the /query endpoint in the average case. The lower bound observed for the _query time of expression_summary is around 4s and the upper bound is around 19s. The lower bound seems too high for the caching to be entirely in memory, and is therefore likely due to local disk caching. The upper bound of 19s seems to indicate that the time is primarily due to network I/O and data (de)serialization to/from AWS S3.
This test compares expression_summary_default_cube and expression_summary_cube by not including the compare dimension (group-by) in the query. The query therefore only uses indexed dimensions and appends only numerical attributes (attributes are non-indexed). To do this, I modified the /query code to query both expression_summary_cube and expression_summary_default_cube for the below query criteria and checked the execution times reported by Datadog.
I used the rdev backend Swagger UI to send the following request body to the wmg/v2/query endpoint (50 genes queried WITHOUT grouping by disease):
{"filter":{"dataset_ids":[],"development_stage_ontology_term_ids":[],"disease_ontology_term_ids":[],"gene_ontology_term_ids":["ENSG00000188596","ENSG00000255794","ENSG00000130413","ENSG00000118997","ENSG00000165309","ENSG00000115423","ENSG00000197748","ENSG00000105877","ENSG00000186952","ENSG00000114670","ENSG00000186409","ENSG00000155761","ENSG00000152582","ENSG00000170959","ENSG00000157423","ENSG00000168038","ENSG00000174776","ENSG00000107249","ENSG00000253438","ENSG00000206530","ENSG00000133640","ENSG00000080298","ENSG00000162814","ENSG00000250305","ENSG00000156042","ENSG00000164692","ENSG00000168542","ENSG00000011465","ENSG00000108821","ENSG00000106809","ENSG00000139329","ENSG00000129009","ENSG00000113140","ENSG00000105894","ENSG00000248527","ENSG00000114115","ENSG00000100097","ENSG00000152661","ENSG00000234741","ENSG00000245910","ENSG00000087303","ENSG00000237550","ENSG00000175061","ENSG00000116774","ENSG00000189058","ENSG00000038427","ENSG00000198856","ENSG00000185222","ENSG00000203875","ENSG00000118418"],"organism_ontology_term_id":"NCBITaxon:9606","self_reported_ethnicity_ontology_term_ids":[],"sex_ontology_term_ids":[],"publication_citations":[]},"is_rollup":true}
This test queries expression_summary_cube with and without group-by selection AND with all the cube data resident on local disk.
v2_query_50_genes_compare_disease = '{"compare":"disease","filter":{"dataset_ids":[],"development_stage_ontology_term_ids":[],"disease_ontology_term_ids":[],"gene_ontology_term_ids":["ENSG00000188596","ENSG00000255794","ENSG00000130413","ENSG00000118997","ENSG00000165309","ENSG00000115423","ENSG00000197748","ENSG00000105877","ENSG00000186952","ENSG00000114670","ENSG00000186409","ENSG00000155761","ENSG00000152582","ENSG00000170959","ENSG00000157423","ENSG00000168038","ENSG00000174776","ENSG00000107249","ENSG00000253438","ENSG00000206530","ENSG00000133640","ENSG00000080298","ENSG00000162814","ENSG00000250305","ENSG00000156042","ENSG00000164692","ENSG00000168542","ENSG00000011465","ENSG00000108821","ENSG00000106809","ENSG00000139329","ENSG00000129009","ENSG00000113140","ENSG00000105894","ENSG00000248527","ENSG00000114115","ENSG00000100097","ENSG00000152661","ENSG00000234741","ENSG00000245910","ENSG00000087303","ENSG00000237550","ENSG00000175061","ENSG00000116774","ENSG00000189058","ENSG00000038427","ENSG00000198856","ENSG00000185222","ENSG00000203875","ENSG00000118418"],"organism_ontology_term_id":"NCBITaxon:9606","self_reported_ethnicity_ontology_term_ids":[],"sex_ontology_term_ids":[],"publication_citations":[]},"is_rollup":true}'
200 trials:
# `q`, `criteria`, and `compare` are set up as in the WMG backend query code
import time

total_query_time = 0
n = 200
for _ in range(n):
    start = time.perf_counter()
    df = q.expression_summary(criteria, compare_dimension=compare)
    end = time.perf_counter()
    total_query_time += (end - start)
print(f"total: {total_query_time}, avg: {total_query_time/n}")
total: 44s, avg: 0.22s
total: 39s, avg: 0.19s
Querying expression_summary_cube when the TileDB data is on local disk is 50-100X faster than the runtimes seen on STAGING. Even accounting for the fact that my local M1 might be significantly more powerful than the web server, CPU and memory advantages alone cannot fully account for a speedup of two orders of magnitude. Network I/O and (de)serialization are not bottlenecks on the local machine, whereas they are likely to be bottlenecks on our AWS infra.
This test compares expression_summary_cube vs expression_summary_default_cube when the TileDB data is on local disk. It is similar to the test on RDEV except that here the TileDB data is on local disk.
v2_query_50_genes_no_compare = '{"filter":{"dataset_ids":[],"development_stage_ontology_term_ids":[],"disease_ontology_term_ids":[],"gene_ontology_term_ids":["ENSG00000188596","ENSG00000255794","ENSG00000130413","ENSG00000118997","ENSG00000165309","ENSG00000115423","ENSG00000197748","ENSG00000105877","ENSG00000186952","ENSG00000114670","ENSG00000186409","ENSG00000155761","ENSG00000152582","ENSG00000170959","ENSG00000157423","ENSG00000168038","ENSG00000174776","ENSG00000107249","ENSG00000253438","ENSG00000206530","ENSG00000133640","ENSG00000080298","ENSG00000162814","ENSG00000250305","ENSG00000156042","ENSG00000164692","ENSG00000168542","ENSG00000011465","ENSG00000108821","ENSG00000106809","ENSG00000139329","ENSG00000129009","ENSG00000113140","ENSG00000105894","ENSG00000248527","ENSG00000114115","ENSG00000100097","ENSG00000152661","ENSG00000234741","ENSG00000245910","ENSG00000087303","ENSG00000237550","ENSG00000175061","ENSG00000116774","ENSG00000189058","ENSG00000038427","ENSG00000198856","ENSG00000185222","ENSG00000203875","ENSG00000118418"],"organism_ontology_term_id":"NCBITaxon:9606","self_reported_ethnicity_ontology_term_ids":[],"sex_ontology_term_ids":[],"publication_citations":[]},"is_rollup":true}'
200 trials:
# `q` and `criteria` are set up as in the WMG backend query code
import time

total_query_time = 0
n = 200
for _ in range(n):
    start = time.perf_counter()
    df = q.expression_summary_default(criteria)
    end = time.perf_counter()
    total_query_time += (end - start)
print(f"total: {total_query_time}, avg: {total_query_time/n}")
total: 20s, avg: 0.10s
Querying expression_summary_default_cube when the TileDB data is on local disk is 30-60X faster than the runtimes seen on STAGING and RDEV. The query time for expression_summary_default_cube is 2X faster than that for expression_summary_cube for the same criteria; this is a similar query-time ratio to the one seen in the RDEV test.
Having the TileDB array on local disk significantly speeds up queries, for the same reason given in the expression_summary_cube query test above. Similar to the RDEV test, the same query criteria execute twice as fast when querying expression_summary_default_cube as when querying expression_summary_cube, because significantly fewer rows are scanned in expression_summary_default_cube.
As a follow-up to the LOCAL MACHINE TESTING, I ran 200 trials of querying expression_summary_cube by generating a query of 50 random genes in each trial, with grouping by disease. The average runtime is still about the same, ~0.20 seconds. I randomized the genes in each query to mitigate the effects of any optimizations due to making the same query in rapid succession:
import random
import time

# WmgQueryCriteriaV2 and find_dimension_id_from_compare come from the WMG backend package
unique_genes = list(expr_default_cube.gene_ontology_term_id.unique())
def get_random_query_criteria():
random_genes = random.sample(unique_genes, k=50)
query = {
"compare":"disease",
"filter":{
"dataset_ids":[],
"development_stage_ontology_term_ids":[],
"disease_ontology_term_ids":[],
"gene_ontology_term_ids": random_genes,
"organism_ontology_term_id":"NCBITaxon:9606",
"self_reported_ethnicity_ontology_term_ids":[],
"sex_ontology_term_ids":[],
"publication_citations":[]
},
"is_rollup":True}
is_rollup = query.get("is_rollup", True)
compare = query.get("compare", None)
if compare:
compare = find_dimension_id_from_compare(compare)
    criteria = WmgQueryCriteriaV2(**query["filter"])
return criteria
def run_trial_query(query_func, n=200):
total_query_time = 0
for _ in range(n):
criteria = get_random_query_criteria()
start = time.perf_counter()
df = query_func(criteria)
end = time.perf_counter()
total_query_time += (end - start)
return total_query_time, total_query_time/n
total, avg = run_trial_query(q.expression_summary, n=200)
print(f"total_time: {total} seconds, avg: {avg} seconds")
total_time: 39.24072681547841 seconds, avg: 0.19620363407739205 seconds
Another suggestion I just thought of is that it could be better to not send off any backend requests until a user has selected all the filters they want to select, and then click some "Submit" button. Right now, my understanding is that touching any of the filters will send off a new request to WMG. So if you want to select 5 different genes and then also group by disease, we'd send off six requests to WMG (one for each of the 5 genes, and another for the group by). If you can pick the filters and then click a "Submit" button before sending off a request to WMG, we'd only have to send one request to WMG. This type of product change obviously wouldn't result in any of our endpoints being faster, but would help the problem of users spending a lot of time waiting for things to load.
@joyceyan -
I think collecting the criteria first and hitting some type of submit button is a good idea to consider. It is a product decision in terms of how the application feels when you do that but assuming that is acceptable, something like that does reduce the total number of requests to the backend.
In the current case of picking genes though, if you pick 5 genes, it cancels the previous request-in-flight (when possible) as you click each gene. So in most cases, picking 5 genes yields 1 request to the backend, but that is not always the case, since a delay between selections could have sent the previous request off and it would be too late to cancel. Moreover, there is scope for a race-condition bug here: a request with 4 genes gets sent to the backend, then adding 1 more gene sends a request with 5 genes, but the 5-gene request comes back faster than the 4-gene request. The 4-gene response then clobbers the 5-gene response, creating a buggy experience.
Oh I see - so it's not as bad as I thought, but sending more requests than we need (even if they get cancelled) is still strictly worse.
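The clobbering scenario can be guarded against on the client by tagging each request with a monotonically increasing sequence number and dropping any response that is older than the latest one already applied. A minimal sketch of the idea (plain Python with hypothetical names; the real frontend would implement this in its TypeScript fetch logic):

```python
import threading

class LatestResponseGuard:
    """Drops responses that belong to superseded requests."""

    def __init__(self):
        self._lock = threading.Lock()
        self._latest_sent = 0
        self._latest_applied = 0
        self.state = None

    def next_request_id(self):
        # Tag each outgoing request with an increasing sequence number.
        with self._lock:
            self._latest_sent += 1
            return self._latest_sent

    def apply_response(self, request_id, payload):
        # Only accept a response that is newer than anything applied so far.
        with self._lock:
            if request_id <= self._latest_applied:
                return False  # stale response: drop it
            self._latest_applied = request_id
            self.state = payload
            return True

guard = LatestResponseGuard()
r4 = guard.next_request_id()  # request with 4 genes
r5 = guard.next_request_id()  # request with 5 genes
guard.apply_response(r5, "5-gene result")  # fast 5-gene response arrives first
guard.apply_response(r4, "4-gene result")  # slow 4-gene response is dropped
print(guard.state)  # -> 5-gene result
```

This complements, rather than replaces, request cancellation: cancellation saves backend work, while the sequence guard makes the UI correct even when cancellation misses.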
@atarashansky had a commit to copy the tiledb data files to the local fs but reverted it the very next day
I think it might be worth revisiting this.
In my rdev, I profiled a test of selecting 50 genes (adding to the dotplot all marker genes for stem cell and ciliated epithelial cell) with group-by disease, using cProfile. The TileDB runtime is 14.77 s.
It seems 98% of the time is spent in the _run_query() method of TileDB's multirange_indexing.py when querying expression_summary_cube:
2999 function calls (2978 primitive calls) in 14.585 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 14.585 14.585 tracer.py:952(func_wrapper)
1 0.000 0.000 14.585 14.585 query.py:87(expression_summary)
1 0.004 0.004 14.585 14.585 query.py:121(_query)
1 0.003 0.003 14.578 14.578 _decorators.py:323(wrapper)
1 0.000 0.000 14.576 14.576 concat.py:146(concat)
1 0.000 0.000 14.570 14.570 concat.py:389(__init__)
2 0.000 0.000 14.570 7.285 multirange_indexing.py:288(__iter__)
1 14.537 14.537 14.570 14.570 multirange_indexing.py:393(_run_query)
1 0.000 0.000 0.033 0.033 {method 'to_pandas' of 'pyarrow.lib._PandasConvertible' objects}
1 0.000 0.000 0.032 0.032 pandas_compat.py:797(table_to_blockmanager)
1 0.000 0.000 0.031 0.031 pandas_compat.py:1163(_table_to_blocks)
1 0.031 0.031 0.031 0.031 {pyarrow.lib.table_to_blocks}
1 0.000 0.000 0.006 0.006 concat.py:565(get_result)
1 0.000 0.000 0.006 0.006 concat.py:176(concatenate_managers)
3 0.006 0.002 0.006 0.002 {method 'copy' of 'numpy.ndarray' objects}
And _run_query() is the central function in the DataFrameIndexer class that implements the .df[] indexing to directly return a dataframe. See here: https://github.com/TileDB-Inc/TileDB-Py/blob/24d5b3bb0869ef4d2f0d0c64c494bed037754949/tiledb/multirange_indexing.py#L394
After performing a TileDB query, we convert the result to a dataframe using .df[], and so it is this conversion to a dataframe that seems to be the problem, rather than the query speed of TileDB or network I/O as originally thought. Moreover, this is true both for querying expression_summary_default_cube AND expression_summary_cube, and for other use cases like using the secondary filter to filter the dotplot by indexed or non-indexed dimensions of the cube.
Here is an example of using tiledb.stats_enable() to profile querying the expression_summary cube with 50 genes and group-by disease:
Server time measured by the server_timing module: 13.80 seconds
TileDB stats dumps the following JSON object:
{'counters': {'Context.StorageManager.Query.Reader.attr_fixed_num': 78,
'Context.StorageManager.Query.Reader.attr_num': 110,
'Context.StorageManager.Query.Reader.attr_var_num': 32,
'Context.StorageManager.Query.Reader.cell_num': 1190000,
'Context.StorageManager.Query.Reader.dim_num': 52,
'Context.StorageManager.Query.Reader.dim_var_num': 52,
'Context.StorageManager.Query.Reader.ignored_tiles': 2,
'Context.StorageManager.Query.Reader.internal_loop_num': 26,
'Context.StorageManager.Query.Reader.loop_num': 26,
'Context.StorageManager.Query.Reader.num_tiles_read': 938,
'Context.StorageManager.Query.Reader.read_unfiltered_byte_num': 96274068,
'Context.StorageManager.Query.Reader.result_num': 501699,
'Context.StorageManager.VFS.file_size_num': 1,
'Context.StorageManager.VFS.read_byte_num': 726522977,
'Context.StorageManager.VFS.read_ops_num': 272,
'Context.StorageManager.read_array_meta_size': 0,
'Context.StorageManager.read_array_schema_size': 960,
'Context.StorageManager.read_frag_meta_size': 1584,
'Context.StorageManager.read_rtree_size': 3171014,
'Context.StorageManager.read_tile_offsets_size': 1865248,
'Context.StorageManager.read_tile_var_offsets_size': 1356544,
'Context.StorageManager.read_tile_var_sizes_size': 1356544,
'Context.StorageManager.read_unfiltered_byte_num': 7750310},
'python': '{"py.buffer_conversion_time": 0.0004088878631591797, '
'"py.core_read_query_initial_submit_time": 13.619953776, '
'"py.core_read_query_total_time": 13.62031456, "py.getitem_time": '
'0.0007717609405517578, "py.getitem_time.add_ranges": '
'0.0001685619354248047, "py.pandas_index_update_time": '
'3.647804260253906e-05, "py.query_retries_count": 0.0}',
'timers': {'Context.StorageManager.Query.Reader.apply_query_condition.avg': 0.000182243,
'Context.StorageManager.Query.Reader.apply_query_condition.sum': 0.00182243,
'Context.StorageManager.Query.Reader.compute_results_count_sparse.avg': 7.56013e-05,
'Context.StorageManager.Query.Reader.compute_results_count_sparse.sum': 0.00884535,
'Context.StorageManager.Query.Reader.compute_tile_bitmaps.avg': 0.000581931,
'Context.StorageManager.Query.Reader.compute_tile_bitmaps.sum': 0.00581931,
'Context.StorageManager.Query.Reader.copy_fixed_data_tiles.avg': 0.000327327,
'Context.StorageManager.Query.Reader.copy_fixed_data_tiles.sum': 0.00981982,
'Context.StorageManager.Query.Reader.copy_offsets_tiles.avg': 0.000786381,
'Context.StorageManager.Query.Reader.copy_offsets_tiles.sum': 0.0275233,
'Context.StorageManager.Query.Reader.copy_var_tiles.avg': 0.000592393,
'Context.StorageManager.Query.Reader.copy_var_tiles.sum': 0.0207338,
'Context.StorageManager.Query.Reader.create_result_tiles.avg': 1.50168e-05,
'Context.StorageManager.Query.Reader.create_result_tiles.sum': 0.000390437,
'Context.StorageManager.Query.Reader.dowork.avg': 0.70863,
'Context.StorageManager.Query.Reader.dowork.sum': 18.4244,
'Context.StorageManager.Query.Reader.load_initial_data.avg': 0.0130393,
'Context.StorageManager.Query.Reader.load_initial_data.sum': 0.339021,
'Context.StorageManager.Query.Reader.load_tile_offsets.avg': 0.00609423,
'Context.StorageManager.Query.Reader.load_tile_offsets.sum': 0.47535,
'Context.StorageManager.Query.Reader.load_tile_var_sizes.avg': 0.0106503,
'Context.StorageManager.Query.Reader.load_tile_var_sizes.sum': 0.276908,
'Context.StorageManager.Query.Reader.process_tiles.avg': 1.44423,
'Context.StorageManager.Query.Reader.process_tiles.sum': 14.4423,
'Context.StorageManager.Query.Reader.read_and_unfilter_attributes.avg': 1.43544,
'Context.StorageManager.Query.Reader.read_and_unfilter_attributes.sum': 14.3544,
'Context.StorageManager.Query.Reader.read_and_unfilter_coords.avg': 0.288153,
'Context.StorageManager.Query.Reader.read_and_unfilter_coords.sum': 2.88153,
'Context.StorageManager.Query.Reader.read_attribute_tiles.avg': 0.7477,
'Context.StorageManager.Query.Reader.read_attribute_tiles.sum': 14.954,
'Context.StorageManager.Query.Reader.read_coordinate_tiles.avg': 0.108101,
'Context.StorageManager.Query.Reader.read_coordinate_tiles.sum': 2.16202,
'Context.StorageManager.Query.Reader.read_tiles.avg': 0.427897,
'Context.StorageManager.Query.Reader.read_tiles.sum': 17.1159,
'Context.StorageManager.Query.Reader.resize_fixed_results_to_copy.avg': 5.1209e-06,
'Context.StorageManager.Query.Reader.resize_fixed_results_to_copy.sum': 5.1209e-05,
'Context.StorageManager.Query.Reader.switch_sizes_to_offsets.avg': 3.96864e-05,
'Context.StorageManager.Query.Reader.switch_sizes_to_offsets.sum': 0.00138902,
'Context.StorageManager.Query.Reader.tile_offset_sizes.avg': 3.68069e-06,
'Context.StorageManager.Query.Reader.tile_offset_sizes.sum': 9.5698e-05,
'Context.StorageManager.Query.Reader.unfilter_attr_tiles.avg': 0.00161202,
'Context.StorageManager.Query.Reader.unfilter_attr_tiles.sum': 0.0822132,
'Context.StorageManager.Query.Reader.unfilter_coord_tiles.avg': 0.000742896,
'Context.StorageManager.Query.Reader.unfilter_coord_tiles.sum': 0.0371448,
'Context.StorageManager.VFS.ArrayDirectory.list_array_meta_uris.avg': 0.011372,
'Context.StorageManager.VFS.ArrayDirectory.list_array_meta_uris.sum': 0.011372,
'Context.StorageManager.VFS.ArrayDirectory.list_array_schema_uris.avg': 0.0841105,
'Context.StorageManager.VFS.ArrayDirectory.list_array_schema_uris.sum': 0.0841105,
'Context.StorageManager.VFS.ArrayDirectory.list_commit_uris.avg': 0.0754825,
'Context.StorageManager.VFS.ArrayDirectory.list_commit_uris.sum': 0.0754825,
'Context.StorageManager.VFS.ArrayDirectory.list_fragment_meta_uris.avg': 0.0677259,
'Context.StorageManager.VFS.ArrayDirectory.list_fragment_meta_uris.sum': 0.0677259,
'Context.StorageManager.VFS.ArrayDirectory.list_root_uris.avg': 0.0158559,
'Context.StorageManager.VFS.ArrayDirectory.list_root_uris.sum': 0.0158559,
'Context.StorageManager.VFS.ArrayDirectory.load_consolidated_commit_uris.avg': 1.904e-06,
'Context.StorageManager.VFS.ArrayDirectory.load_consolidated_commit_uris.sum': 1.904e-06,
'Context.StorageManager.array_open_READ.avg': 0.443385,
'Context.StorageManager.array_open_READ.sum': 0.443385,
'Context.StorageManager.array_open_read_load_directory.avg': 0.0914446,
'Context.StorageManager.array_open_read_load_directory.sum': 0.0914446,
'Context.StorageManager.array_open_read_load_schemas_and_fragment_meta.avg': 0.35193,
'Context.StorageManager.array_open_read_load_schemas_and_fragment_meta.sum': 0.35193,
'Context.StorageManager.sm_load_all_array_schemas.avg': 0.0262434,
'Context.StorageManager.sm_load_all_array_schemas.sum': 0.0262434,
'Context.StorageManager.sm_load_array_metadata.avg': 5.957e-06,
'Context.StorageManager.sm_load_array_metadata.sum': 5.957e-06,
'Context.StorageManager.sm_load_array_schema_from_uri.avg': 0.0262303,
'Context.StorageManager.sm_load_array_schema_from_uri.sum': 0.0262303,
'Context.StorageManager.sm_load_array_schemas_and_fragment_metadata.avg': 0.351926,
'Context.StorageManager.sm_load_array_schemas_and_fragment_metadata.sum': 0.351926,
'Context.StorageManager.sm_load_filtered_fragment_uris.avg': 3.4826e-05,
'Context.StorageManager.sm_load_filtered_fragment_uris.sum': 3.4826e-05,
'Context.StorageManager.sm_load_fragment_metadata.avg': 0.325637,
'Context.StorageManager.sm_load_fragment_metadata.sum': 0.325637}}
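One way to read these counters is as a measure of read amplification: bytes fetched through the VFS layer versus rows actually returned. A quick back-of-the-envelope with the numbers above (assuming the counters mean what their names suggest, i.e. read_byte_num counts all bytes fetched via the VFS):

```python
# Counters copied from the stats dump above.
vfs_read_bytes = 726_522_977  # Context.StorageManager.VFS.read_byte_num
result_rows = 501_699         # Context.StorageManager.Query.Reader.result_num

total_mb = vfs_read_bytes / 1e6
bytes_per_row = vfs_read_bytes / result_rows
print(f"total read: {total_mb:.0f} MB")              # -> total read: 727 MB
print(f"bytes read per result row: {bytes_per_row:.0f}")  # -> 1448
```

Roughly 727 MB are read to produce ~500K result rows, i.e. about 1.4 KB per row, which would be consistent with the large-transfer hypothesis discussed earlier in this thread.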
I have also used cProfile to count non-CPU time (I/O, process pre-empted by the OS scheduler, memory swap, etc) and found that about 60-80% of the time spent does not involve the CPU.
For example, in the test case of picking 50 genes with the group-by disease operation that queries the expression_summary cube, the total reported time is 13.50 seconds and the non-CPU time is about 10.50 seconds.
The functions I wrote that enable profiling total process time or non-CPU time are here (cleaned up a little for readability):
import io
import logging
import os
from cProfile import Profile
from pstats import Stats

logger = logging.getLogger(__name__)

def not_cpu_time():
times = os.times()
return times.elapsed - (times.system + times.user)
def cprofile_query(*, q, criteria, compare=None, measure_io=False):
profiler = Profile(not_cpu_time) if measure_io else Profile()
expression_summary = profiler.runcall(q.expression_summary, criteria, compare_dimension=compare)
s = io.StringIO()
stats = Stats(profiler, stream=s)
stats.strip_dirs()
stats.sort_stats("cumulative")
stats.print_stats()
logger.info(f"Profiling expression_summary with measure_io flag set to: {measure_io}:\n{s.getvalue()}")
return expression_summary
# to profile non-cpu-time call it with measure_io=True:
cprofile_query(q=q, criteria=criteria, compare=compare, measure_io=True)
# to profile full process time (cpu and non-cpu-time included), call with measure_io=False:
cprofile_query(q=q, criteria=criteria, compare=compare, measure_io=False)
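As a sanity check on the not_cpu_time() clock, time spent blocked (e.g. sleeping, or waiting on I/O) should show up as non-CPU time while consuming almost no user/system time. A quick self-contained check (assumes a Unix-like OS, since os.times().elapsed is not populated on all platforms):

```python
import os
import time

def not_cpu_time():
    # elapsed wall time minus CPU time = time the process was not on the CPU
    times = os.times()
    return times.elapsed - (times.system + times.user)

start = not_cpu_time()
time.sleep(0.2)  # blocked, consuming essentially no CPU
non_cpu = not_cpu_time() - start
print(f"non-cpu time during sleep: {non_cpu:.2f}s")  # close to 0.2s
```

Passing this clock as the timer argument to cProfile.Profile, as done above, makes the profiler attribute cumulative time by non-CPU time instead of wall time.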
Test environment: m5.xlarge (4 vCPUs and 16 GB RAM). Query: select 50 genes and select group by: disease.
Querying without the DataFrameIndexer conversion to a dataframe object is 1000X faster: 0.001 seconds! However, I could not get one of the test cases below to work because the context needed to set it up was quite a bit more complicated.
I have a command line script here that does profiling in several ways.
Steps:
1. Check out the single-cell-data-portal repo
2. $ cd single-cell-data-portal
3. Place the staging-snapshot in the single-cell-data-portal folder
Profiler Script Usage:
# simple wallclock profiling calling the function the web app would call to execute the query
$ python3 profile_expr_summary.py --data-location local --profile-type simple-wallclock --profile-scope web-app-query
# simple wallclock profiling exclusively calling tiledb query method
$ python3 profile_expr_summary.py --data-location local --profile-type simple-wallclock --profile-scope raw-query
# Collect tiledb stats for web app query
$ python3 profile_expr_summary.py --data-location local --profile-type tiledb-stats --profile-scope web-app-query
# Collect tiledb stats for raw query
$ python3 profile_expr_summary.py --data-location local --profile-type tiledb-stats --profile-scope raw-query
# simple wallclock profiling of raw query by pulling snapshot from s3 - THIS CURRENTLY DOES NOT WORK
$ python3 profile_expr_summary.py --data-location s3 --profile-type tiledb-stats --profile-scope raw-query
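For orientation, the script's CLI surface can be sketched with argparse roughly like this (a hypothetical reconstruction based only on the flags shown in the usage examples above; the real script may differ):

```python
import argparse

def build_parser():
    # Flags mirror the usage examples above.
    p = argparse.ArgumentParser(description="Profile expression_summary queries")
    p.add_argument("--data-location", choices=["local", "s3"], required=True,
                   help="where the TileDB snapshot lives")
    p.add_argument("--profile-type", choices=["simple-wallclock", "tiledb-stats"],
                   required=True, help="which profiler to use")
    p.add_argument("--profile-scope", choices=["web-app-query", "raw-query"],
                   required=True,
                   help="profile the full web-app query path or only the raw TileDB query")
    return p

# Example: parse the same flags as the "command I ran" below.
args = build_parser().parse_args(
    ["--data-location", "local",
     "--profile-type", "tiledb-stats",
     "--profile-scope", "web-app-query"]
)
print(args.data_location, args.profile_type, args.profile_scope)
# -> local tiledb-stats web-app-query
```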
# The command I ran
$ python3 profile_expr_summary.py --data-location local --profile-type tiledb-stats --profile-scope web-app-query
{'counters': {'Context.StorageManager.Query.Reader.attr_fixed_num': 24,
'Context.StorageManager.Query.Reader.attr_num': 33,
'Context.StorageManager.Query.Reader.attr_var_num': 9,
'Context.StorageManager.Query.Reader.cell_num': 950000,
'Context.StorageManager.Query.Reader.dim_num': 23,
'Context.StorageManager.Query.Reader.dim_var_num': 23,
'Context.StorageManager.Query.Reader.ignored_tiles': 1,
'Context.StorageManager.Query.Reader.internal_loop_num': 8,
'Context.StorageManager.Query.Reader.loop_num': 8,
'Context.StorageManager.Query.Reader.num_tiles_read': 57546,
'Context.StorageManager.Query.Reader.read_unfiltered_byte_num': 5712173514,
'Context.StorageManager.Query.Reader.result_num': 42915399,
'Context.StorageManager.VFS.file_size_num': 2,
'Context.StorageManager.VFS.read_byte_num': 1437612444,
'Context.StorageManager.VFS.read_ops_num': 255,
'Context.StorageManager.read_array_meta_size': 0,
'Context.StorageManager.read_array_schema_size': 1480,
'Context.StorageManager.read_frag_meta_size': 2552,
'Context.StorageManager.read_rtree_size': 3171014,
'Context.StorageManager.read_tile_offsets_size': 1594656,
'Context.StorageManager.read_tile_var_offsets_size': 983904,
'Context.StorageManager.read_tile_var_sizes_size': 983904,
'Context.StorageManager.read_unfiltered_byte_num': 6734958,
'Context.StorageManager.subSubarray.add_range_dim_0': 50,
'Context.StorageManager.subSubarray.add_range_dim_2': 1},
'python': '{"py.buffer_conversion_time": 0.0003116130828857422, '
'"py.core_read_query_initial_submit_time": 0.593130085, '
'"py.core_read_query_total_time": 0.593898498, "py.getitem_time": '
'0.0007047653198242188, "py.getitem_time.add_ranges": '
'0.00018072128295898438, "py.pandas_index_update_time": '
'4.744529724121094e-05, "py.query_retries_count": 0.0}',
'timers': {'Context.StorageManager.Query.Reader.apply_query_condition.avg': 1.5e-07,
'Context.StorageManager.Query.Reader.apply_query_condition.sum': 3e-07,
'Context.StorageManager.Query.Reader.compute_results_count_sparse.avg': 7.83403e-05,
'Context.StorageManager.Query.Reader.compute_results_count_sparse.sum': 0.0072073,
'Context.StorageManager.Query.Reader.compute_tile_bitmaps.avg': 0.00123238,
'Context.StorageManager.Query.Reader.compute_tile_bitmaps.sum': 0.00246476,
'Context.StorageManager.Query.Reader.copy_fixed_data_tiles.avg': 0.00319179,
'Context.StorageManager.Query.Reader.copy_fixed_data_tiles.sum': 0.076603,
'Context.StorageManager.Query.Reader.copy_offsets_tiles.avg': 0.0164023,
'Context.StorageManager.Query.Reader.copy_offsets_tiles.sum': 0.524874,
'Context.StorageManager.Query.Reader.copy_var_tiles.avg': 0.00946177,
'Context.StorageManager.Query.Reader.copy_var_tiles.sum': 0.302777,
'Context.StorageManager.Query.Reader.create_result_tiles.avg': 0.00479548,
'Context.StorageManager.Query.Reader.create_result_tiles.sum': 0.00959097,
'Context.StorageManager.Query.Reader.dowork.avg': 0.893305,
'Context.StorageManager.Query.Reader.dowork.sum': 7.14644,
'Context.StorageManager.Query.Reader.load_initial_data.avg': 0.00761377,
'Context.StorageManager.Query.Reader.load_initial_data.sum': 0.0152275,
'Context.StorageManager.Query.Reader.load_tile_offsets.avg': 0.0016087,
'Context.StorageManager.Query.Reader.load_tile_offsets.sum': 0.00965217,
'Context.StorageManager.Query.Reader.load_tile_var_sizes.avg': 0.00110676,
'Context.StorageManager.Query.Reader.load_tile_var_sizes.sum': 0.00221352,
'Context.StorageManager.Query.Reader.process_tiles.avg': 0.883828,
'Context.StorageManager.Query.Reader.process_tiles.sum': 7.07062,
'Context.StorageManager.Query.Reader.read_and_unfilter_attributes.avg': 0.627374,
'Context.StorageManager.Query.Reader.read_and_unfilter_attributes.sum': 5.01899,
'Context.StorageManager.Query.Reader.read_and_unfilter_coords.avg': 0.0150342,
'Context.StorageManager.Query.Reader.read_and_unfilter_coords.sum': 0.0300684,
'Context.StorageManager.Query.Reader.read_attribute_tiles.avg': 0.120051,
'Context.StorageManager.Query.Reader.read_attribute_tiles.sum': 1.20051,
'Context.StorageManager.Query.Reader.read_coordinate_tiles.avg': 0.00525141,
'Context.StorageManager.Query.Reader.read_coordinate_tiles.sum': 0.0105028,
'Context.StorageManager.Query.Reader.read_tiles.avg': 0.100915,
'Context.StorageManager.Query.Reader.read_tiles.sum': 1.21098,
'Context.StorageManager.Query.Reader.resize_fixed_results_to_copy.avg': 2.22929e-05,
'Context.StorageManager.Query.Reader.resize_fixed_results_to_copy.sum': 0.000178343,
'Context.StorageManager.Query.Reader.switch_sizes_to_offsets.avg': 0.00631311,
'Context.StorageManager.Query.Reader.switch_sizes_to_offsets.sum': 0.202019,
'Context.StorageManager.Query.Reader.tile_offset_sizes.avg': 7.6905e-06,
'Context.StorageManager.Query.Reader.tile_offset_sizes.sum': 1.5381e-05,
'Context.StorageManager.Query.Reader.unfilter_attr_tiles.avg': 0.0634444,
'Context.StorageManager.Query.Reader.unfilter_attr_tiles.sum': 2.09366,
'Context.StorageManager.Query.Reader.unfilter_coord_tiles.avg': 0.0639595,
'Context.StorageManager.Query.Reader.unfilter_coord_tiles.sum': 1.72691,
'Context.StorageManager.VFS.ArrayDirectory.list_array_meta_uris.avg': 9.882e-06,
'Context.StorageManager.VFS.ArrayDirectory.list_array_meta_uris.sum': 1.9764e-05,
'Context.StorageManager.VFS.ArrayDirectory.list_array_schema_uris.avg': 5.9662e-05,
'Context.StorageManager.VFS.ArrayDirectory.list_array_schema_uris.sum': 0.000119324,
'Context.StorageManager.VFS.ArrayDirectory.list_commit_uris.avg': 8.21e-05,
'Context.StorageManager.VFS.ArrayDirectory.list_commit_uris.sum': 0.0001642,
'Context.StorageManager.VFS.ArrayDirectory.list_fragment_meta_uris.avg': 1.25745e-05,
'Context.StorageManager.VFS.ArrayDirectory.list_fragment_meta_uris.sum': 2.5149e-05,
'Context.StorageManager.VFS.ArrayDirectory.list_root_uris.avg': 7.61865e-05,
'Context.StorageManager.VFS.ArrayDirectory.list_root_uris.sum': 0.000152373,
'Context.StorageManager.VFS.ArrayDirectory.load_consolidated_commit_uris.avg': 1.0425e-06,
'Context.StorageManager.VFS.ArrayDirectory.load_consolidated_commit_uris.sum': 2.085e-06,
'Context.StorageManager.array_open_READ.avg': 0.000473925,
'Context.StorageManager.array_open_READ.sum': 0.00094785,
'Context.StorageManager.array_open_read_load_directory.avg': 0.000242743,
'Context.StorageManager.array_open_read_load_directory.sum': 0.000485486,
'Context.StorageManager.array_open_read_load_schemas_and_fragment_meta.avg': 0.000222092,
'Context.StorageManager.array_open_read_load_schemas_and_fragment_meta.sum': 0.000444183,
'Context.StorageManager.sm_load_all_array_schemas.avg': 0.00012188,
'Context.StorageManager.sm_load_all_array_schemas.sum': 0.000243761,
'Context.StorageManager.sm_load_array_metadata.avg': 6.674e-06,
'Context.StorageManager.sm_load_array_metadata.sum': 1.3348e-05,
'Context.StorageManager.sm_load_array_schema_from_uri.avg': 0.000114239,
'Context.StorageManager.sm_load_array_schema_from_uri.sum': 0.000228479,
'Context.StorageManager.sm_load_array_schemas_and_fragment_metadata.avg': 0.000220344,
'Context.StorageManager.sm_load_array_schemas_and_fragment_metadata.sum': 0.000440687,
'Context.StorageManager.sm_load_filtered_fragment_uris.avg': 2.35015e-05,
'Context.StorageManager.sm_load_filtered_fragment_uris.sum': 4.7003e-05,
'Context.StorageManager.sm_load_fragment_metadata.avg': 6.96155e-05,
'Context.StorageManager.sm_load_fragment_metadata.sum': 0.000139231,
'Context.StorageManager.subSubarray.compute_relevant_frags.avg': 5.881e-05,
'Context.StorageManager.subSubarray.compute_relevant_frags.sum': 5.881e-05,
'Context.StorageManager.subSubarray.read_compute_simple_tile_overlap.avg': 0.0151931,
'Context.StorageManager.subSubarray.read_compute_simple_tile_overlap.sum': 0.0151931,
'Context.StorageManager.subSubarray.read_load_relevant_rtrees.avg': 0.0147723,
'Context.StorageManager.subSubarray.read_load_relevant_rtrees.sum': 0.0147723,
'Context.StorageManager.subSubarray.sort_ranges.avg': 9.5146e-05,
'Context.StorageManager.subSubarray.sort_ranges.sum': 0.000190292}}
Running the following single tiledb query script, I noticed a couple of interesting things:
NOTE: This script runs a single hardcoded query exactly once. There is no randomization of any kind.
This is the closest approximation to the resource allocation for our prod
environment. In prod
, each container gets 4 vCPU and 30 GB of RAM
. There are no EC2 instance types with exactly this spec, so I settled on the conservative side.
Querying from S3 can be as much as 10X slower than querying from local disk (see the following points).
The first 2 queries to S3 are extremely slow, on the order of the runtimes we see in the web application, but after that the same query is fairly fast. This supports the hypothesis that I/O is the culprit for the slow runtimes: subsequent runs hit a cache and are therefore significantly faster. The caching might be done by the S3 service to serve recently queried files:
(venv) ssm-user@ip-172-31-13-49:~/single-cell-data-portal$ python3 profile_single_query_s3.py --data-location s3 --profile-type simple-wallclock
Query Time: 9.871938740951009
(venv) ssm-user@ip-172-31-13-49:~/single-cell-data-portal$ python3 profile_single_query_s3.py --data-location s3 --profile-type simple-wallclock
Query Time: 6.401317582000047
(venv) ssm-user@ip-172-31-13-49:~/single-cell-data-portal$ python3 profile_single_query_s3.py --data-location s3 --profile-type simple-wallclock
Query Time: 3.399237139034085
(venv) ssm-user@ip-172-31-13-49:~/single-cell-data-portal$ python3 profile_single_query_s3.py --data-location s3 --profile-type simple-wallclock
Query Time: 2.3840295899426565
(venv) ssm-user@ip-172-31-13-49:~/single-cell-data-portal$ python3 profile_single_query_s3.py --data-location s3 --profile-type simple-wallclock
Query Time: 2.479845870984718
(venv) ssm-user@ip-172-31-13-49:~/single-cell-data-portal$ python3 profile_single_query_s3.py --data-location s3 --profile-type simple-wallclock
Query Time: 2.3942137280246243
(venv) ssm-user@ip-172-31-13-49:~/single-cell-data-portal$ python3 profile_single_query_s3.py --data-location s3 --profile-type simple-wallclock
Query Time: 7.943302907049656
(venv) ssm-user@ip-172-31-13-49:~/single-cell-data-portal$ python3 profile_single_query_s3.py --data-location s3 --profile-type simple-wallclock
Query Time: 7.340425575966947
(venv) ssm-user@ip-172-31-13-49:~/single-cell-data-portal$ python3 profile_single_query_s3.py --data-location local --profile-type simple-wallclock
Query Time: 4.589273056015372
(venv) ssm-user@ip-172-31-13-49:~/single-cell-data-portal$ python3 profile_single_query_s3.py --data-location local --profile-type simple-wallclock
Query Time: 0.6380285220220685
(venv) ssm-user@ip-172-31-13-49:~/single-cell-data-portal$ python3 profile_single_query_s3.py --data-location local --profile-type simple-wallclock
Query Time: 0.6346769869560376
(venv) ssm-user@ip-172-31-13-49:~/single-cell-data-portal$ python3 profile_single_query_s3.py --data-location s3 --profile-type simple-wallclock
Query Time: 5.885547672049142
(venv) ssm-user@ip-172-31-13-49:~/single-cell-data-portal$ python3 profile_single_query_s3.py --data-location local --profile-type simple-wallclock
Query Time: 0.6382372019579634
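The simple-wallclock mode used above amounts to timing a single hardcoded TileDB read. A minimal sketch of that harness, assuming hypothetical names (`run_query`, the cube URI, and the gene IDs are illustrative, not the real script's contents):

```python
import time

def time_call(fn, *args, **kwargs):
    """Time a single call to fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

def run_query(uri):
    """Open a TileDB array and run one hardcoded read. Requires the
    tiledb package; uri may be a local path or an s3:// prefix.
    The URI and gene IDs below are placeholders."""
    import tiledb
    with tiledb.open(uri) as cube:
        return cube.df[["ENSG00000141510", "ENSG00000133703"], :]

# Usage (not executed here, since it would hit S3):
#   _, elapsed = time_call(run_query, "s3://<wmg-bucket>/snapshot/expression_summary")
#   print(f"Query Time: {elapsed}")
```

The same harness covers both `--data-location s3` and `--data-location local` by swapping the URI, which is what makes the S3-vs-local comparison above apples-to-apples.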
This is the closest approximation to the resource allocation for our rdev
environment. In rdev
, each container gets 2 vCPU and 8 GB of RAM
.
Querying from S3 is noticeably slower (~13 seconds) on the m5.large EC2 instance
than on the more powerful m5.xlarge EC2 instance
(~7-9 seconds). But local disk query times remain fast across instance types and don't vary much. Therefore, querying from S3 on less powerful machines makes the runtimes significantly slower. This has cost implications: by moving the data to local disk, we give ourselves the option to run on less powerful machines because tiledb queries will remain fast.
(venv) ssm-user@ip-172-31-24-15:~/single-cell-data-portal$ python3 profile_single_query_s3.py --data-location s3 --profile-type simple-wallclock
Query Time: 13.380075528000361
(venv) ssm-user@ip-172-31-24-15:~/single-cell-data-portal$ python3 profile_single_query_s3.py --data-location local --profile-type simple-wallclock
Query Time: 0.7477878340000643
@prathapsridharan Is the first query to the local snapshot always slow? Subsequent queries are fast if you use the same query as the first one, but what if you change the query each time? Is it slow each time? If that's the case, then I think a more relevant performance comparison would be looking at the first invocation of a unique query.
@atarashansky - The slowness on the first query of the local disk snapshot is likely due to the OS not having had a chance to warm the filesystem cache. To illustrate that point, I ran 2 tests for the local disk scenario. I even included a sleep of 30 seconds between runs to check whether that might surface any deviant query times; it doesn't:
$ python3 profile_expr_summary.py --data-location local --profile-type simple-wallclock --profile-scope raw-query --random True --num-trials 20 --wait 30
#####Simple Wallclock Profiling RAW QUERY#####
num_trials: 20, num_genes_per_trial: 50, randomize_queries: True, wait_time: 30.0
trial 0: query_time: 0.3260967959649861
trial 1: query_time: 0.28244731796439737
trial 2: query_time: 0.2792491050204262
trial 3: query_time: 0.28724771400447935
trial 4: query_time: 0.2647852109512314
trial 5: query_time: 0.26117483305279166
trial 6: query_time: 0.3354923519073054
trial 7: query_time: 0.3096563830040395
trial 8: query_time: 0.2736107819946483
trial 9: query_time: 0.31849775300361216
trial 10: query_time: 0.17899379297159612
trial 11: query_time: 0.1869285280117765
trial 12: query_time: 0.3909081279998645
trial 13: query_time: 0.2754709960427135
trial 14: query_time: 0.24768701498396695
trial 15: query_time: 0.34317331202328205
trial 16: query_time: 0.3123967950232327
trial 17: query_time: 0.22704013204202056
trial 18: query_time: 0.23085670894943178
trial 19: query_time: 0.25487172696739435
$ python3 profile_expr_summary.py --data-location local --profile-type simple-wallclock --profile-scope raw-query --num-trials 10 --wait 30
#####Simple Wallclock Profiling RAW QUERY#####
num_trials: 10, num_genes_per_trial: 50, randomize_queries: False, wait_time: 30.0
trial 0: query_time: 0.6219111330574378
trial 1: query_time: 0.597706374945119
trial 2: query_time: 0.6027412810362875
trial 3: query_time: 0.5878630029037595
trial 4: query_time: 0.6020356669323519
trial 5: query_time: 0.6042859819717705
trial 6: query_time: 0.5935844500781968
trial 7: query_time: 0.6033124219393358
trial 8: query_time: 0.6097885479684919
trial 9: query_time: 0.6087296219775453
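The two runs above differ only in whether the gene list is re-sampled per trial. A sketch of that trial loop, with hypothetical names (`run_trials`, `query_fn`, `gene_pool` are illustrative) and an injectable `sleep` so the wait between trials can be stubbed out:

```python
import random
import time

def run_trials(query_fn, gene_pool, num_trials, genes_per_trial=50,
               randomize=True, wait=0.0, sleep=time.sleep):
    """Run query_fn(genes) num_trials times and return per-trial
    wall-clock times. With randomize=True a fresh gene sample is
    drawn each trial; otherwise the same fixed list is reused."""
    fixed = gene_pool[:genes_per_trial]
    times = []
    for trial in range(num_trials):
        genes = random.sample(gene_pool, genes_per_trial) if randomize else fixed
        start = time.perf_counter()
        query_fn(genes)
        times.append(time.perf_counter() - start)
        if wait and trial < num_trials - 1:
            sleep(wait)  # e.g. --wait 30 in the runs above
    return times
```

The point of randomizing is to defeat result caching: if randomized queries were much slower than repeated identical ones, the fast local-disk numbers would just be cache hits. The output above shows they are fast either way.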
This test runs a single hardcoded query against TileDB data resident in S3 and collects tiledb-stats to investigate why there is significant variation in query time between multiple runs of the same query spaced about 30-60 seconds apart.
Note that when the tiledb array is on local disk, the same query does not have wildly varying query times over multiple runs, even when waiting several seconds or minutes between runs.
$ time python3 profile_single_query_s3.py --data-location s3 --profile-type tiledb-stats
real 0m7.246s
user 0m2.200s
sys 0m1.900s
{'counters': {'Context.StorageManager.Query.Reader.attr_fixed_num': 3,
'Context.StorageManager.Query.Reader.attr_num': 5,
'Context.StorageManager.Query.Reader.attr_var_num': 2,
'Context.StorageManager.Query.Reader.cell_num': 910000,
'Context.StorageManager.Query.Reader.dim_num': 2,
'Context.StorageManager.Query.Reader.dim_var_num': 2,
'Context.StorageManager.Query.Reader.ignored_tiles': 0,
'Context.StorageManager.Query.Reader.internal_loop_num': 1,
'Context.StorageManager.Query.Reader.loop_num': 1,
'Context.StorageManager.Query.Reader.num_tiles_read': 728,
'Context.StorageManager.Query.Reader.read_unfiltered_byte_num': 73981040,
'Context.StorageManager.Query.Reader.result_num': 419298,
'Context.StorageManager.VFS.file_size_num': 1,
'Context.StorageManager.VFS.read_byte_num': 705308023,
'Context.StorageManager.VFS.read_ops_num': 126,
'Context.StorageManager.read_array_meta_size': 0,
'Context.StorageManager.read_array_schema_size': 960,
'Context.StorageManager.read_frag_meta_size': 1584,
'Context.StorageManager.read_rtree_size': 3244564,
'Context.StorageManager.read_tile_offsets_size': 1388224,
'Context.StorageManager.read_tile_var_offsets_size': 867640,
'Context.StorageManager.read_tile_var_sizes_size': 867640,
'Context.StorageManager.read_unfiltered_byte_num': 6369028},
'python': '{"py.buffer_conversion_time": 0.0004360675811767578, '
'"py.core_read_query_initial_submit_time": 6.460057297, '
'"py.core_read_query_total_time": 6.4605091, "py.getitem_time": '
'0.0017042160034179688, "py.getitem_time.add_ranges": '
'0.00015687942504882812, "py.pandas_index_update_time": '
'5.269050598144531e-05, "py.query_retries_count": 0.0}',
'timers': {'Context.StorageManager.Query.Reader.apply_query_condition.avg': 2.84e-07,
'Context.StorageManager.Query.Reader.apply_query_condition.sum': 2.84e-07,
'Context.StorageManager.Query.Reader.compute_results_count_sparse.avg': 7.83214e-05,
'Context.StorageManager.Query.Reader.compute_results_count_sparse.sum': 0.00689228,
'Context.StorageManager.Query.Reader.compute_tile_bitmaps.avg': 0.00227109,
'Context.StorageManager.Query.Reader.compute_tile_bitmaps.sum': 0.00227109,
'Context.StorageManager.Query.Reader.copy_fixed_data_tiles.avg': 0.00113583,
'Context.StorageManager.Query.Reader.copy_fixed_data_tiles.sum': 0.00340748,
'Context.StorageManager.Query.Reader.copy_offsets_tiles.avg': 0.00233254,
'Context.StorageManager.Query.Reader.copy_offsets_tiles.sum': 0.00933016,
'Context.StorageManager.Query.Reader.copy_var_tiles.avg': 0.0013499,
'Context.StorageManager.Query.Reader.copy_var_tiles.sum': 0.00539961,
'Context.StorageManager.Query.Reader.create_result_tiles.avg': 0.000163552,
'Context.StorageManager.Query.Reader.create_result_tiles.sum': 0.000163552,
'Context.StorageManager.Query.Reader.dowork.avg': 6.45981,
'Context.StorageManager.Query.Reader.dowork.sum': 6.45981,
'Context.StorageManager.Query.Reader.load_initial_data.avg': 0.0792599,
'Context.StorageManager.Query.Reader.load_initial_data.sum': 0.0792599,
'Context.StorageManager.Query.Reader.load_tile_offsets.avg': 0.0555098,
'Context.StorageManager.Query.Reader.load_tile_offsets.sum': 0.166529,
'Context.StorageManager.Query.Reader.load_tile_var_sizes.avg': 0.0515541,
'Context.StorageManager.Query.Reader.load_tile_var_sizes.sum': 0.0515541,
'Context.StorageManager.Query.Reader.process_tiles.avg': 5.89972,
'Context.StorageManager.Query.Reader.process_tiles.sum': 5.89972,
'Context.StorageManager.Query.Reader.read_and_unfilter_attributes.avg': 5.87673,
'Context.StorageManager.Query.Reader.read_and_unfilter_attributes.sum': 5.87673,
'Context.StorageManager.Query.Reader.read_and_unfilter_coords.avg': 0.259972,
'Context.StorageManager.Query.Reader.read_and_unfilter_coords.sum': 0.259972,
'Context.StorageManager.Query.Reader.read_attribute_tiles.avg': 2.90949,
'Context.StorageManager.Query.Reader.read_attribute_tiles.sum': 5.81897,
'Context.StorageManager.Query.Reader.read_coordinate_tiles.avg': 0.120042,
'Context.StorageManager.Query.Reader.read_coordinate_tiles.sum': 0.240084,
'Context.StorageManager.Query.Reader.read_tiles.avg': 1.51476,
'Context.StorageManager.Query.Reader.read_tiles.sum': 6.05904,
'Context.StorageManager.Query.Reader.resize_fixed_results_to_copy.avg': 3.407e-06,
'Context.StorageManager.Query.Reader.resize_fixed_results_to_copy.sum': 3.407e-06,
'Context.StorageManager.Query.Reader.switch_sizes_to_offsets.avg': 0.000255104,
'Context.StorageManager.Query.Reader.switch_sizes_to_offsets.sum': 0.00102042,
'Context.StorageManager.Query.Reader.tile_offset_sizes.avg': 7.313e-06,
'Context.StorageManager.Query.Reader.tile_offset_sizes.sum': 7.313e-06,
'Context.StorageManager.Query.Reader.unfilter_attr_tiles.avg': 0.00651393,
'Context.StorageManager.Query.Reader.unfilter_attr_tiles.sum': 0.0325696,
'Context.StorageManager.Query.Reader.unfilter_coord_tiles.avg': 0.00391671,
'Context.StorageManager.Query.Reader.unfilter_coord_tiles.sum': 0.0195835,
'Context.StorageManager.VFS.ArrayDirectory.list_array_meta_uris.avg': 0.0703353,
'Context.StorageManager.VFS.ArrayDirectory.list_array_meta_uris.sum': 0.0703353,
'Context.StorageManager.VFS.ArrayDirectory.list_array_schema_uris.avg': 0.077721,
'Context.StorageManager.VFS.ArrayDirectory.list_array_schema_uris.sum': 0.077721,
'Context.StorageManager.VFS.ArrayDirectory.list_commit_uris.avg': 0.0709293,
'Context.StorageManager.VFS.ArrayDirectory.list_commit_uris.sum': 0.0709293,
'Context.StorageManager.VFS.ArrayDirectory.list_fragment_meta_uris.avg': 0.0765407,
'Context.StorageManager.VFS.ArrayDirectory.list_fragment_meta_uris.sum': 0.0765407,
'Context.StorageManager.VFS.ArrayDirectory.list_root_uris.avg': 0.0717963,
'Context.StorageManager.VFS.ArrayDirectory.list_root_uris.sum': 0.0717963,
'Context.StorageManager.VFS.ArrayDirectory.load_consolidated_commit_uris.avg': 1.495e-06,
'Context.StorageManager.VFS.ArrayDirectory.load_consolidated_commit_uris.sum': 1.495e-06,
'Context.StorageManager.array_open_READ.avg': 0.243471,
'Context.StorageManager.array_open_READ.sum': 0.243471,
'Context.StorageManager.array_open_read_load_directory.avg': 0.0777874,
'Context.StorageManager.array_open_read_load_directory.sum': 0.0777874,
'Context.StorageManager.array_open_read_load_schemas_and_fragment_meta.avg': 0.165674,
'Context.StorageManager.array_open_read_load_schemas_and_fragment_meta.sum': 0.165674,
'Context.StorageManager.sm_load_all_array_schemas.avg': 0.0199367,
'Context.StorageManager.sm_load_all_array_schemas.sum': 0.0199367,
'Context.StorageManager.sm_load_array_metadata.avg': 7.046e-06,
'Context.StorageManager.sm_load_array_metadata.sum': 7.046e-06,
'Context.StorageManager.sm_load_array_schema_from_uri.avg': 0.0199263,
'Context.StorageManager.sm_load_array_schema_from_uri.sum': 0.0199263,
'Context.StorageManager.sm_load_array_schemas_and_fragment_metadata.avg': 0.165671,
'Context.StorageManager.sm_load_array_schemas_and_fragment_metadata.sum': 0.165671,
'Context.StorageManager.sm_load_filtered_fragment_uris.avg': 3.2699e-05,
'Context.StorageManager.sm_load_filtered_fragment_uris.sum': 3.2699e-05,
'Context.StorageManager.sm_load_fragment_metadata.avg': 0.14569,
'Context.StorageManager.sm_load_fragment_metadata.sum': 0.14569}}
$ time python3 profile_single_query_s3.py --data-location s3 --profile-type tiledb-stats
real 0m3.958s
user 0m2.076s
sys 0m1.642s
{'counters': {'Context.StorageManager.Query.Reader.attr_fixed_num': 3,
'Context.StorageManager.Query.Reader.attr_num': 5,
'Context.StorageManager.Query.Reader.attr_var_num': 2,
'Context.StorageManager.Query.Reader.cell_num': 910000,
'Context.StorageManager.Query.Reader.dim_num': 2,
'Context.StorageManager.Query.Reader.dim_var_num': 2,
'Context.StorageManager.Query.Reader.ignored_tiles': 0,
'Context.StorageManager.Query.Reader.internal_loop_num': 1,
'Context.StorageManager.Query.Reader.loop_num': 1,
'Context.StorageManager.Query.Reader.num_tiles_read': 728,
'Context.StorageManager.Query.Reader.read_unfiltered_byte_num': 73981040,
'Context.StorageManager.Query.Reader.result_num': 419298,
'Context.StorageManager.VFS.file_size_num': 1,
'Context.StorageManager.VFS.read_byte_num': 705308023,
'Context.StorageManager.VFS.read_ops_num': 126,
'Context.StorageManager.read_array_meta_size': 0,
'Context.StorageManager.read_array_schema_size': 960,
'Context.StorageManager.read_frag_meta_size': 1584,
'Context.StorageManager.read_rtree_size': 3244564,
'Context.StorageManager.read_tile_offsets_size': 1388224,
'Context.StorageManager.read_tile_var_offsets_size': 867640,
'Context.StorageManager.read_tile_var_sizes_size': 867640,
'Context.StorageManager.read_unfiltered_byte_num': 6369028},
'python': '{"py.buffer_conversion_time": 0.0004315376281738281, '
'"py.core_read_query_initial_submit_time": 3.244939281, '
'"py.core_read_query_total_time": 3.245374541, "py.getitem_time": '
'0.001729726791381836, "py.getitem_time.add_ranges": '
'0.00015234947204589844, "py.pandas_index_update_time": '
'5.888938903808594e-05, "py.query_retries_count": 0.0}',
'timers': {'Context.StorageManager.Query.Reader.apply_query_condition.avg': 1.73e-07,
'Context.StorageManager.Query.Reader.apply_query_condition.sum': 1.73e-07,
'Context.StorageManager.Query.Reader.compute_results_count_sparse.avg': 7.93653e-05,
'Context.StorageManager.Query.Reader.compute_results_count_sparse.sum': 0.00698415,
'Context.StorageManager.Query.Reader.compute_tile_bitmaps.avg': 0.00231364,
'Context.StorageManager.Query.Reader.compute_tile_bitmaps.sum': 0.00231364,
'Context.StorageManager.Query.Reader.copy_fixed_data_tiles.avg': 0.00116303,
'Context.StorageManager.Query.Reader.copy_fixed_data_tiles.sum': 0.0034891,
'Context.StorageManager.Query.Reader.copy_offsets_tiles.avg': 0.00237875,
'Context.StorageManager.Query.Reader.copy_offsets_tiles.sum': 0.00951498,
'Context.StorageManager.Query.Reader.copy_var_tiles.avg': 0.0013519,
'Context.StorageManager.Query.Reader.copy_var_tiles.sum': 0.00540759,
'Context.StorageManager.Query.Reader.create_result_tiles.avg': 0.00017221,
'Context.StorageManager.Query.Reader.create_result_tiles.sum': 0.00017221,
'Context.StorageManager.Query.Reader.dowork.avg': 3.24464,
'Context.StorageManager.Query.Reader.dowork.sum': 3.24464,
'Context.StorageManager.Query.Reader.load_initial_data.avg': 0.0445728,
'Context.StorageManager.Query.Reader.load_initial_data.sum': 0.0445728,
'Context.StorageManager.Query.Reader.load_tile_offsets.avg': 0.0597256,
'Context.StorageManager.Query.Reader.load_tile_offsets.sum': 0.179177,
'Context.StorageManager.Query.Reader.load_tile_var_sizes.avg': 0.0493179,
'Context.StorageManager.Query.Reader.load_tile_var_sizes.sum': 0.0493179,
'Context.StorageManager.Query.Reader.process_tiles.avg': 2.75611,
'Context.StorageManager.Query.Reader.process_tiles.sum': 2.75611,
'Context.StorageManager.Query.Reader.read_and_unfilter_attributes.avg': 2.73499,
'Context.StorageManager.Query.Reader.read_and_unfilter_attributes.sum': 2.73499,
'Context.StorageManager.Query.Reader.read_and_unfilter_coords.avg': 0.212649,
'Context.StorageManager.Query.Reader.read_and_unfilter_coords.sum': 0.212649,
'Context.StorageManager.Query.Reader.read_attribute_tiles.avg': 1.33839,
'Context.StorageManager.Query.Reader.read_attribute_tiles.sum': 2.67678,
'Context.StorageManager.Query.Reader.read_coordinate_tiles.avg': 0.096224,
'Context.StorageManager.Query.Reader.read_coordinate_tiles.sum': 0.192448,
'Context.StorageManager.Query.Reader.read_tiles.avg': 0.717303,
'Context.StorageManager.Query.Reader.read_tiles.sum': 2.86921,
'Context.StorageManager.Query.Reader.resize_fixed_results_to_copy.avg': 4.009e-06,
'Context.StorageManager.Query.Reader.resize_fixed_results_to_copy.sum': 4.009e-06,
'Context.StorageManager.Query.Reader.switch_sizes_to_offsets.avg': 0.000278483,
'Context.StorageManager.Query.Reader.switch_sizes_to_offsets.sum': 0.00111393,
'Context.StorageManager.Query.Reader.tile_offset_sizes.avg': 6.549e-06,
'Context.StorageManager.Query.Reader.tile_offset_sizes.sum': 6.549e-06,
'Context.StorageManager.Query.Reader.unfilter_attr_tiles.avg': 0.0065119,
'Context.StorageManager.Query.Reader.unfilter_attr_tiles.sum': 0.0325595,
'Context.StorageManager.Query.Reader.unfilter_coord_tiles.avg': 0.00397223,
'Context.StorageManager.Query.Reader.unfilter_coord_tiles.sum': 0.0198612,
'Context.StorageManager.VFS.ArrayDirectory.list_array_meta_uris.avg': 0.0840461,
'Context.StorageManager.VFS.ArrayDirectory.list_array_meta_uris.sum': 0.0840461,
'Context.StorageManager.VFS.ArrayDirectory.list_array_schema_uris.avg': 0.0862886,
'Context.StorageManager.VFS.ArrayDirectory.list_array_schema_uris.sum': 0.0862886,
'Context.StorageManager.VFS.ArrayDirectory.list_commit_uris.avg': 0.0854544,
'Context.StorageManager.VFS.ArrayDirectory.list_commit_uris.sum': 0.0854544,
'Context.StorageManager.VFS.ArrayDirectory.list_fragment_meta_uris.avg': 0.0844494,
'Context.StorageManager.VFS.ArrayDirectory.list_fragment_meta_uris.sum': 0.0844494,
'Context.StorageManager.VFS.ArrayDirectory.list_root_uris.avg': 0.0843779,
'Context.StorageManager.VFS.ArrayDirectory.list_root_uris.sum': 0.0843779,
'Context.StorageManager.VFS.ArrayDirectory.load_consolidated_commit_uris.avg': 1.519e-06,
'Context.StorageManager.VFS.ArrayDirectory.load_consolidated_commit_uris.sum': 1.519e-06,
'Context.StorageManager.array_open_READ.avg': 0.150774,
'Context.StorageManager.array_open_READ.sum': 0.150774,
'Context.StorageManager.array_open_read_load_directory.avg': 0.0864629,
'Context.StorageManager.array_open_read_load_directory.sum': 0.0864629,
'Context.StorageManager.array_open_read_load_schemas_and_fragment_meta.avg': 0.0643007,
'Context.StorageManager.array_open_read_load_schemas_and_fragment_meta.sum': 0.0643007,
'Context.StorageManager.sm_load_all_array_schemas.avg': 0.0209265,
'Context.StorageManager.sm_load_all_array_schemas.sum': 0.0209265,
'Context.StorageManager.sm_load_array_metadata.avg': 7.497e-06,
'Context.StorageManager.sm_load_array_metadata.sum': 7.497e-06,
'Context.StorageManager.sm_load_array_schema_from_uri.avg': 0.0209128,
'Context.StorageManager.sm_load_array_schema_from_uri.sum': 0.0209128,
'Context.StorageManager.sm_load_array_schemas_and_fragment_metadata.avg': 0.0642973,
'Context.StorageManager.sm_load_array_schemas_and_fragment_metadata.sum': 0.0642973,
'Context.StorageManager.sm_load_filtered_fragment_uris.avg': 4.1077e-05,
'Context.StorageManager.sm_load_filtered_fragment_uris.sum': 4.1077e-05,
'Context.StorageManager.sm_load_fragment_metadata.avg': 0.0433201,
'Context.StorageManager.sm_load_fragment_metadata.sum': 0.0433201}}
$ time python3 profile_single_query_s3.py --data-location s3 --profile-type tiledb-stats
real 0m4.850s
user 0m2.000s
sys 0m1.420s
{'counters': {'Context.StorageManager.Query.Reader.attr_fixed_num': 3,
'Context.StorageManager.Query.Reader.attr_num': 5,
'Context.StorageManager.Query.Reader.attr_var_num': 2,
'Context.StorageManager.Query.Reader.cell_num': 910000,
'Context.StorageManager.Query.Reader.dim_num': 2,
'Context.StorageManager.Query.Reader.dim_var_num': 2,
'Context.StorageManager.Query.Reader.ignored_tiles': 0,
'Context.StorageManager.Query.Reader.internal_loop_num': 1,
'Context.StorageManager.Query.Reader.loop_num': 1,
'Context.StorageManager.Query.Reader.num_tiles_read': 728,
'Context.StorageManager.Query.Reader.read_unfiltered_byte_num': 73981040,
'Context.StorageManager.Query.Reader.result_num': 419298,
'Context.StorageManager.VFS.file_size_num': 1,
'Context.StorageManager.VFS.read_byte_num': 705308023,
'Context.StorageManager.VFS.read_ops_num': 126,
'Context.StorageManager.read_array_meta_size': 0,
'Context.StorageManager.read_array_schema_size': 960,
'Context.StorageManager.read_frag_meta_size': 1584,
'Context.StorageManager.read_rtree_size': 3244564,
'Context.StorageManager.read_tile_offsets_size': 1388224,
'Context.StorageManager.read_tile_var_offsets_size': 867640,
'Context.StorageManager.read_tile_var_sizes_size': 867640,
'Context.StorageManager.read_unfiltered_byte_num': 6369028},
'python': '{"py.buffer_conversion_time": 0.00042366981506347656, '
'"py.core_read_query_initial_submit_time": 4.141099672, '
'"py.core_read_query_total_time": 4.141520283, "py.getitem_time": '
'0.0017459392547607422, "py.getitem_time.add_ranges": '
'0.00015592575073242188, "py.pandas_index_update_time": '
'5.602836608886719e-05, "py.query_retries_count": 0.0}',
'timers': {'Context.StorageManager.Query.Reader.apply_query_condition.avg': 2.18e-07,
'Context.StorageManager.Query.Reader.apply_query_condition.sum': 2.18e-07,
'Context.StorageManager.Query.Reader.compute_results_count_sparse.avg': 8.2695e-05,
'Context.StorageManager.Query.Reader.compute_results_count_sparse.sum': 0.00727716,
'Context.StorageManager.Query.Reader.compute_tile_bitmaps.avg': 0.00271688,
'Context.StorageManager.Query.Reader.compute_tile_bitmaps.sum': 0.00271688,
'Context.StorageManager.Query.Reader.copy_fixed_data_tiles.avg': 0.00113459,
'Context.StorageManager.Query.Reader.copy_fixed_data_tiles.sum': 0.00340378,
'Context.StorageManager.Query.Reader.copy_offsets_tiles.avg': 0.00235719,
'Context.StorageManager.Query.Reader.copy_offsets_tiles.sum': 0.00942877,
'Context.StorageManager.Query.Reader.copy_var_tiles.avg': 0.00130936,
'Context.StorageManager.Query.Reader.copy_var_tiles.sum': 0.00523746,
'Context.StorageManager.Query.Reader.create_result_tiles.avg': 0.000160009,
'Context.StorageManager.Query.Reader.create_result_tiles.sum': 0.000160009,
'Context.StorageManager.Query.Reader.dowork.avg': 4.14086,
'Context.StorageManager.Query.Reader.dowork.sum': 4.14086,
'Context.StorageManager.Query.Reader.load_initial_data.avg': 0.0516287,
'Context.StorageManager.Query.Reader.load_initial_data.sum': 0.0516287,
'Context.StorageManager.Query.Reader.load_tile_offsets.avg': 0.0648514,
'Context.StorageManager.Query.Reader.load_tile_offsets.sum': 0.194554,
'Context.StorageManager.Query.Reader.load_tile_var_sizes.avg': 0.0574084,
'Context.StorageManager.Query.Reader.load_tile_var_sizes.sum': 0.0574084,
'Context.StorageManager.Query.Reader.process_tiles.avg': 3.56619,
'Context.StorageManager.Query.Reader.process_tiles.sum': 3.56619,
'Context.StorageManager.Query.Reader.read_and_unfilter_attributes.avg': 3.54313,
'Context.StorageManager.Query.Reader.read_and_unfilter_attributes.sum': 3.54313,
'Context.StorageManager.Query.Reader.read_and_unfilter_coords.avg': 0.267875,
'Context.StorageManager.Query.Reader.read_and_unfilter_coords.sum': 0.267875,
'Context.StorageManager.Query.Reader.read_attribute_tiles.avg': 1.74258,
'Context.StorageManager.Query.Reader.read_attribute_tiles.sum': 3.48515,
'Context.StorageManager.Query.Reader.read_coordinate_tiles.avg': 0.123706,
'Context.StorageManager.Query.Reader.read_coordinate_tiles.sum': 0.247412,
'Context.StorageManager.Query.Reader.read_tiles.avg': 0.933138,
'Context.StorageManager.Query.Reader.read_tiles.sum': 3.73255,
'Context.StorageManager.Query.Reader.resize_fixed_results_to_copy.avg': 4.769e-06,
'Context.StorageManager.Query.Reader.resize_fixed_results_to_copy.sum': 4.769e-06,
'Context.StorageManager.Query.Reader.switch_sizes_to_offsets.avg': 0.000255998,
'Context.StorageManager.Query.Reader.switch_sizes_to_offsets.sum': 0.00102399,
'Context.StorageManager.Query.Reader.tile_offset_sizes.avg': 6.61e-06,
'Context.StorageManager.Query.Reader.tile_offset_sizes.sum': 6.61e-06,
'Context.StorageManager.Query.Reader.unfilter_attr_tiles.avg': 0.00649242,
'Context.StorageManager.Query.Reader.unfilter_attr_tiles.sum': 0.0324621,
'Context.StorageManager.Query.Reader.unfilter_coord_tiles.avg': 0.00402229,
'Context.StorageManager.Query.Reader.unfilter_coord_tiles.sum': 0.0201115,
'Context.StorageManager.VFS.ArrayDirectory.list_array_meta_uris.avg': 0.0708566,
'Context.StorageManager.VFS.ArrayDirectory.list_array_meta_uris.sum': 0.0708566,
'Context.StorageManager.VFS.ArrayDirectory.list_array_schema_uris.avg': 0.0766091,
'Context.StorageManager.VFS.ArrayDirectory.list_array_schema_uris.sum': 0.0766091,
'Context.StorageManager.VFS.ArrayDirectory.list_commit_uris.avg': 0.0768642,
'Context.StorageManager.VFS.ArrayDirectory.list_commit_uris.sum': 0.0768642,
'Context.StorageManager.VFS.ArrayDirectory.list_fragment_meta_uris.avg': 0.0742092,
'Context.StorageManager.VFS.ArrayDirectory.list_fragment_meta_uris.sum': 0.0742092,
'Context.StorageManager.VFS.ArrayDirectory.list_root_uris.avg': 0.0906188,
'Context.StorageManager.VFS.ArrayDirectory.list_root_uris.sum': 0.0906188,
'Context.StorageManager.VFS.ArrayDirectory.load_consolidated_commit_uris.avg': 2.198e-06,
'Context.StorageManager.VFS.ArrayDirectory.load_consolidated_commit_uris.sum': 2.198e-06,
'Context.StorageManager.array_open_READ.avg': 0.161641,
'Context.StorageManager.array_open_READ.sum': 0.161641,
'Context.StorageManager.array_open_read_load_directory.avg': 0.0908443,
'Context.StorageManager.array_open_read_load_directory.sum': 0.0908443,
'Context.StorageManager.array_open_read_load_schemas_and_fragment_meta.avg': 0.0707864,
'Context.StorageManager.array_open_read_load_schemas_and_fragment_meta.sum': 0.0707864,
'Context.StorageManager.sm_load_all_array_schemas.avg': 0.0227951,
'Context.StorageManager.sm_load_all_array_schemas.sum': 0.0227951,
'Context.StorageManager.sm_load_array_metadata.avg': 7.044e-06,
'Context.StorageManager.sm_load_array_metadata.sum': 7.044e-06,
'Context.StorageManager.sm_load_array_schema_from_uri.avg': 0.0227812,
'Context.StorageManager.sm_load_array_schema_from_uri.sum': 0.0227812,
'Context.StorageManager.sm_load_array_schemas_and_fragment_metadata.avg': 0.0707832,
'Context.StorageManager.sm_load_array_schemas_and_fragment_metadata.sum': 0.0707832,
'Context.StorageManager.sm_load_filtered_fragment_uris.avg': 4.8789e-05,
'Context.StorageManager.sm_load_filtered_fragment_uris.sum': 4.8789e-05,
'Context.StorageManager.sm_load_fragment_metadata.avg': 0.0479296,
'Context.StorageManager.sm_load_fragment_metadata.sum': 0.0479296}}
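One way to sanity-check the I/O hypothesis against these dumps is to compare the `read_and_unfilter_*` timers to the total query time (`dowork.sum`). A small helper, using the timer names from the dumps above (the `sample` dict is abbreviated to the relevant keys from the first S3 run; note nested timers can overlap, so this is a rough estimate):

```python
def read_time_fraction(timers):
    """Estimate the fraction of total query time (dowork.sum) spent in
    the read_and_unfilter_* timers from a TileDB stats dump."""
    prefix = "Context.StorageManager.Query.Reader."
    total = timers[prefix + "dowork.sum"]
    read = sum(v for k, v in timers.items()
               if k.startswith(prefix + "read_and_unfilter") and k.endswith(".sum"))
    return read / total

# Abbreviated timers taken from the first S3 run above:
sample = {
    "Context.StorageManager.Query.Reader.dowork.sum": 6.45981,
    "Context.StorageManager.Query.Reader.read_and_unfilter_attributes.sum": 5.87673,
    "Context.StorageManager.Query.Reader.read_and_unfilter_coords.sum": 0.259972,
}
```

For the first S3 run this comes out to roughly 95% of `dowork` time spent reading and unfiltering, which is what motivates the I/O-bottleneck conclusion below.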
@atarashansky @dsadgat @atolopko-czi
Based on the investigation above, we are reasonably confident that moving the snapshot to the web application's local disk during deployment/initialization will significantly reduce TileDB query times and give us sub-second latency for our most expensive queries.
This is because S3 is not a dedicated resource like the local disk: S3 data retrieval times depend on factors such as caching policy, load on the S3 service, and network load. The TileDB stats above indicate that the vast majority of the query time is spent on I/O (see the Context.StorageManager.Query.Reader.read_*
timings), and this includes network I/O. We have good reason to suspect network I/O is the bottleneck when reading the data from S3, and that bottleneck is significant - in our tests, local disk reads are 5-10X faster than reads from S3.
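To make the copy-to-local-disk suggestion concrete, here is a minimal sketch (all names hypothetical, not the actual WMG code) of how the web app could resolve the snapshot location at startup - preferring a local copy when one has been synced during deployment, and falling back to S3 otherwise:

```python
import os

def resolve_snapshot_uri(local_dir: str, s3_uri: str) -> str:
    """Prefer a locally synced snapshot; fall back to reading directly from S3.

    `local_dir` and `s3_uri` are hypothetical parameters; in practice the
    local copy would be populated at deploy time (e.g. via `aws s3 sync`).
    """
    if os.path.isdir(local_dir) and os.listdir(local_dir):
        # Local disk reads were 5-10X faster than S3 in the tests above.
        return local_dir
    return s3_uri

# Example: no local copy present, so queries go against S3.
uri = resolve_snapshot_uri("/nonexistent/snapshot", "s3://bucket/snapshot")
```

The fallback keeps the deployment safe: if the sync step fails or is skipped, the app still works (just slower), which makes this easy to roll out incrementally.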
It has also been suggested that we take a look at the high-performance S3 Express One Zone storage class (thanks @bkmartinjr ), which purports to deliver "consistent single-digit millisecond data access for your most frequently accessed data and latency-sensitive applications. S3 Express One Zone can improve data access speeds by 10x and reduce request costs by 50% compared to S3 Standard and scales to process millions of requests per minute."
There is another option that seems even more promising: Mountpoint for Amazon S3, suggested by @atolopko-czi. Per the documentation: "Mountpoint for Amazon S3 can now cache data in Amazon EC2 instance storage, instance memory, or an Amazon EBS volume for repeated data access. This avoids redundant requests to Amazon S3 when you need to read the same data multiple times, which improves both the cost and performance of your application. For example, by using Mountpoint for Amazon S3 and caching data in Amazon EC2 instance storage, you can complete your machine learning training jobs up to 2x faster by avoiding redundant requests to Amazon S3."
If Mountpoint for S3 indeed brings query latencies to acceptable levels, then it might be faster to implement and less error-prone than copying the snapshot to local disk.
There is also a pending investigation into using postgres
as the data store for the WMG cube. Initial investigations by @atolopko-czi look promising for all of GE's query patterns. In particular, it shows promise in reducing the rollup
operation from 8-10 seconds to <1 second (a 10X improvement). But we are still a ways away from committing to it.
Decide if Mountpoint obviates the need for copying the TileDB data files to local disk. The high-performance S3 Express One Zone storage class is not included in the list of candidate solutions to investigate because Mountpoint seems like a logically better solution.
mount-s3: 0.74 seconds
vs s3: 1.23 seconds
vs local-disk: 0.28 seconds
(comparing the last 90 runs to account for the effect of cache warming in the first 10 runs)
NOTE: query is for 50 genes over ALL tissues
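The per-location averages above discard the first 10 warm-up runs before taking the mean. A small sketch of that computation (the trial times here are illustrative, not the actual measurements):

```python
# Warm-cache average: drop the first `warmup` runs so cache warming
# doesn't skew the mean, then average the remaining query times.
def warm_avg(query_times, warmup=10):
    warm = query_times[warmup:]
    return sum(warm) / len(warm)

# Illustrative mount-s3-like series: slow cold-cache runs, then steady warm runs.
times = [4.5, 2.6, 1.7, 1.3, 2.0, 1.1, 1.3, 1.8, 0.7, 1.2] + [0.74] * 90

print(warm_avg(times))  # averages only the last 90 runs
```

Including the cold runs would roughly double the reported mount-s3 average, which is why the head-to-head numbers above are quoted over the last 90 runs only.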
# S3
$ python profile_single_query_s3.py --data-location s3 --profile-type simple-wallclock
Query Time: 6.748166222999998
# MOUNT-S3
$ python profile_single_query_s3.py --data-location mount-s3 --profile-type simple-wallclock
Query Time: 1.5921803740000087
# LOCAL DISK
$ python profile_single_query_s3.py --data-location local --profile-type simple-wallclock
Query Time: 0.6546747279999181
NOTE: query is for 50 genes over ALL tissues
Caching is enabled when mount-s3
is configured:
$ mount-s3 cellxgene-wmg-staging wmg-cube-mount-s3 --cache cache-wmg-cube-mount-s3 --metadata-ttl 300
$ python profile_expr_summary.py --data-location mount-s3 --profile-type simple-wallclock --profile-scope raw-query --num-trials 100 --random True
#####Simple Wallclock Profiling RAW QUERY#####
num_trials: 100, num_genes_per_trial: 50, randomize_queries: True, wait_time: 0.1
total_query_time: 84.93799073400032, avg_query_time: 0.8493799073400032
trial 0: query_time: 4.458109610000065
trial 1: query_time: 2.5819802759999675
trial 2: query_time: 1.654652904000045
trial 3: query_time: 1.3077535179999131
trial 4: query_time: 1.9769351569999571
trial 5: query_time: 1.1004632560000118
trial 6: query_time: 1.3345652950000613
trial 7: query_time: 1.7758604320000586
trial 8: query_time: 0.7463126810000631
trial 9: query_time: 1.1715542599999935
trial 10: query_time: 1.2971961010000541
trial 11: query_time: 1.499866001999976
trial 12: query_time: 0.7727857449999647
trial 13: query_time: 0.6348842930000274
trial 14: query_time: 1.1153419169999097
trial 15: query_time: 0.9048437559999911
trial 16: query_time: 0.7109619850000399
trial 17: query_time: 0.894831138000086
trial 18: query_time: 1.188690336000036
trial 19: query_time: 0.8177241539999613
trial 20: query_time: 0.6851584220000859
trial 21: query_time: 0.5988486710000416
trial 22: query_time: 0.5904493919999823
trial 23: query_time: 0.5490066790000583
trial 24: query_time: 1.0126316700000189
trial 25: query_time: 0.8298233699999855
trial 26: query_time: 0.5345813519999183
trial 27: query_time: 0.662106289999997
trial 28: query_time: 0.5923850259999881
trial 29: query_time: 0.6814443660000506
trial 30: query_time: 0.8294042369999488
trial 31: query_time: 0.5753452720000496
trial 32: query_time: 0.7977175329999682
trial 33: query_time: 0.6071753160000526
trial 34: query_time: 0.6226004379999495
trial 35: query_time: 0.6575300069999912
trial 36: query_time: 0.7002456540000139
trial 37: query_time: 1.285965970999996
trial 38: query_time: 1.1663920839999946
trial 39: query_time: 0.7513186669999641
trial 40: query_time: 0.656368059999977
trial 41: query_time: 0.7102322559999266
trial 42: query_time: 0.5516998939999667
trial 43: query_time: 0.830023180000012
trial 44: query_time: 0.8122672350000357
trial 45: query_time: 0.630903529999955
trial 46: query_time: 1.4243341730000338
trial 47: query_time: 1.1057274869999674
trial 48: query_time: 0.5985588050000388
trial 49: query_time: 0.9679796819999638
trial 50: query_time: 1.025151289000064
trial 51: query_time: 0.7352055889999747
trial 52: query_time: 0.6162365740000268
trial 53: query_time: 0.5753514220000397
trial 54: query_time: 0.7383069479999449
trial 55: query_time: 0.8410589269999491
trial 56: query_time: 0.9383649459999788
trial 57: query_time: 0.897521561000076
trial 58: query_time: 1.0693073559999675
trial 59: query_time: 0.5501075309999806
trial 60: query_time: 0.5846752400000241
trial 61: query_time: 0.7247286850000592
trial 62: query_time: 0.7725954569999658
trial 63: query_time: 0.7867966049999495
trial 64: query_time: 0.5920192209999868
trial 65: query_time: 0.8036754490000249
trial 66: query_time: 0.5323291160000281
trial 67: query_time: 0.6382877339999595
trial 68: query_time: 0.5746756229999619
trial 69: query_time: 0.6052720100000215
trial 70: query_time: 0.5027580239999452
trial 71: query_time: 0.711808700000006
trial 72: query_time: 0.7767928340000481
trial 73: query_time: 0.6472635080000373
trial 74: query_time: 0.9157147470000382
trial 75: query_time: 0.7538526939999883
trial 76: query_time: 0.688559142000031
trial 77: query_time: 0.6730260580000049
trial 78: query_time: 0.8670512950000102
trial 79: query_time: 0.522248191000017
trial 80: query_time: 0.6768851660000337
trial 81: query_time: 0.5359373399999185
trial 82: query_time: 0.621782244999963
trial 83: query_time: 0.628841972000032
trial 84: query_time: 0.6184606140000142
trial 85: query_time: 0.8015787379999892
trial 86: query_time: 0.587429942999961
trial 87: query_time: 0.5087339709999696
trial 88: query_time: 0.5376343399999541
trial 89: query_time: 0.5914901419999978
trial 90: query_time: 0.6029010980000749
trial 91: query_time: 0.6341135149999673
trial 92: query_time: 0.6610181510000075
trial 93: query_time: 0.4858367790000102
trial 94: query_time: 0.5385020109999914
trial 95: query_time: 0.5957841180000969
trial 96: query_time: 0.52943291400004
trial 97: query_time: 0.8288854480000509
trial 98: query_time: 0.7310777010000038
trial 99: query_time: 0.5973845170000232
NOTE: query is for 50 genes over ALL tissues
$ python profile_expr_summary.py --data-location local --profile-type simple-wallclock --profile-scope raw-query --num-trials 100 --random True
#####Simple Wallclock Profiling RAW QUERY#####
num_trials: 100, num_genes_per_trial: 50, randomize_queries: True, wait_time: 0.1
total_query_time: 32.65206927299994, avg_query_time: 0.3265206927299994
trial 0: query_time: 1.4072613220000108
trial 1: query_time: 2.3759740790000023
trial 2: query_time: 0.5368550270000014
trial 3: query_time: 0.9578235669999913
trial 4: query_time: 0.28350593100000765
trial 5: query_time: 0.30751738200001455
trial 6: query_time: 0.31380537399999753
trial 7: query_time: 0.32840677999999457
trial 8: query_time: 0.29934287099999324
trial 9: query_time: 0.22024661400001833
trial 10: query_time: 0.31282072999999855
trial 11: query_time: 0.29767627600000424
trial 12: query_time: 0.2902254990000017
trial 13: query_time: 0.35103022499998815
trial 14: query_time: 0.3035185159999969
trial 15: query_time: 0.31603130199999896
trial 16: query_time: 0.24937240499997415
trial 17: query_time: 0.26602102399999694
trial 18: query_time: 0.3001532619999807
trial 19: query_time: 0.3459727100000123
trial 20: query_time: 0.21078897999998958
trial 21: query_time: 0.3082457399999896
trial 22: query_time: 0.28248107899997876
trial 23: query_time: 0.3546332549999818
trial 24: query_time: 0.279284964999988
trial 25: query_time: 0.17006857799998443
trial 26: query_time: 0.23051406100000804
trial 27: query_time: 0.32190842399998587
trial 28: query_time: 0.2723682710000048
trial 29: query_time: 0.283731854999985
trial 30: query_time: 0.3090172530000075
trial 31: query_time: 0.31138136100000224
trial 32: query_time: 0.2748879780000095
trial 33: query_time: 0.32576917999998045
trial 34: query_time: 0.18598574300000337
trial 35: query_time: 0.29140421600001787
trial 36: query_time: 0.23635345899998583
trial 37: query_time: 0.37989791000001105
trial 38: query_time: 0.28009902400000897
trial 39: query_time: 0.26255842000000484
trial 40: query_time: 0.27622637199999645
trial 41: query_time: 0.22819967400002383
trial 42: query_time: 0.31207555200001025
trial 43: query_time: 0.2586563610000212
trial 44: query_time: 0.3080421359999832
trial 45: query_time: 0.2868817690000185
trial 46: query_time: 0.2384845359999872
trial 47: query_time: 0.2630379590000018
trial 48: query_time: 0.2690308609999761
trial 49: query_time: 0.20165372400001047
trial 50: query_time: 0.28473853200000576
trial 51: query_time: 0.25942350999997643
trial 52: query_time: 0.3189702759999875
trial 53: query_time: 0.2464530340000124
trial 54: query_time: 0.2984945339999854
trial 55: query_time: 0.3433219669999801
trial 56: query_time: 0.30183743500001015
trial 57: query_time: 0.22690623099998675
trial 58: query_time: 0.34041376700000114
trial 59: query_time: 0.22151536000001215
trial 60: query_time: 0.2960845069999891
trial 61: query_time: 0.22988944299999048
trial 62: query_time: 0.25595664900001225
trial 63: query_time: 0.3659249149999937
trial 64: query_time: 0.2777073929999858
trial 65: query_time: 0.30865272500000174
trial 66: query_time: 0.37089284099999986
trial 67: query_time: 0.2557718740000041
trial 68: query_time: 0.27712515899997925
trial 69: query_time: 0.26933869500001606
trial 70: query_time: 0.21371085300000914
trial 71: query_time: 0.2876549640000121
trial 72: query_time: 0.3173794589999943
trial 73: query_time: 0.2855394560000093
trial 74: query_time: 0.2950028239999938
trial 75: query_time: 0.2466388489999929
trial 76: query_time: 0.2891999789999886
trial 77: query_time: 0.35536709699999847
trial 78: query_time: 0.32608064799998715
trial 79: query_time: 0.37119952800000533
trial 80: query_time: 0.25508051799999976
trial 81: query_time: 0.3064294440000026
trial 82: query_time: 0.2668080850000081
trial 83: query_time: 0.2993090690000031
trial 84: query_time: 0.24298014100000387
trial 85: query_time: 0.25985804500001564
trial 86: query_time: 0.2653288660000044
trial 87: query_time: 0.3573647000000051
trial 88: query_time: 0.26351055300000326
trial 89: query_time: 0.36487833000001046
trial 90: query_time: 0.3459658890000128
trial 91: query_time: 0.22377544799999782
trial 92: query_time: 0.2584887149999986
trial 93: query_time: 0.33219296099997564
trial 94: query_time: 0.25037413499998706
trial 95: query_time: 0.2965343940000196
trial 96: query_time: 0.25898273100000324
trial 97: query_time: 0.25781055499999184
trial 98: query_time: 0.25261749200001304
trial 99: query_time: 0.2793571060000204
NOTE: query is for 50 genes over ALL tissues
$ python profile_expr_summary.py --data-location s3 --profile-type simple-wallclock --profile-scope raw-query --num-trials 100 --random True
#####Simple Wallclock Profiling RAW QUERY#####
num_trials: 100, num_genes_per_trial: 50, randomize_queries: True, wait_time: 0.1
total_query_time: 135.46871183299987, avg_query_time: 1.3546871183299987
trial 0: query_time: 4.003985972999999
trial 1: query_time: 3.573740692000001
trial 2: query_time: 2.509278213000016
trial 3: query_time: 2.9792625430000044
trial 4: query_time: 3.0536960640000075
trial 5: query_time: 1.9944726909999986
trial 6: query_time: 1.0862587259999827
trial 7: query_time: 1.3573313630000143
trial 8: query_time: 1.9742172940000273
trial 9: query_time: 1.6433746779999865
trial 10: query_time: 1.4304261359999941
trial 11: query_time: 1.9637260859999515
trial 12: query_time: 1.6831383580000079
trial 13: query_time: 1.606910973999959
trial 14: query_time: 0.9548125299999697
trial 15: query_time: 0.9560846659999811
trial 16: query_time: 1.2880271770000036
trial 17: query_time: 1.0503580980000038
trial 18: query_time: 1.1362863399999696
trial 19: query_time: 1.3571949529999756
trial 20: query_time: 1.871374163999974
trial 21: query_time: 1.2705959780000171
trial 22: query_time: 1.1581272490000174
trial 23: query_time: 1.5232975200000283
trial 24: query_time: 1.7971981979999896
trial 25: query_time: 1.6522054189999835
trial 26: query_time: 1.819237438000016
trial 27: query_time: 1.572304264999957
trial 28: query_time: 1.9447415980000073
trial 29: query_time: 1.2144866610000236
trial 30: query_time: 0.9088127830000303
trial 31: query_time: 1.4932521650000012
trial 32: query_time: 0.9466949410000325
trial 33: query_time: 1.5894347000000266
trial 34: query_time: 1.1557056170000237
trial 35: query_time: 1.0891182670000035
trial 36: query_time: 1.1767906669999775
trial 37: query_time: 1.1259856920000288
trial 38: query_time: 1.6331904759999816
trial 39: query_time: 1.028848326000002
trial 40: query_time: 1.074094936999984
trial 41: query_time: 0.8739587820000452
trial 42: query_time: 1.6000312780000172
trial 43: query_time: 1.3146131400000058
trial 44: query_time: 0.9708979579999664
trial 45: query_time: 1.0968067150000138
trial 46: query_time: 1.0503926040000238
trial 47: query_time: 1.2663111580000077
trial 48: query_time: 1.4521793879999905
trial 49: query_time: 1.004449396000041
trial 50: query_time: 1.231613901000003
trial 51: query_time: 0.8825964750000139
trial 52: query_time: 1.1230223580000143
trial 53: query_time: 1.026162338000006
trial 54: query_time: 1.2339305140000079
trial 55: query_time: 1.4468387879999796
trial 56: query_time: 0.8095109519999824
trial 57: query_time: 1.0504910480000262
trial 58: query_time: 1.5855452669999863
trial 59: query_time: 0.9767604419999998
trial 60: query_time: 0.9815348839999842
trial 61: query_time: 1.4171188699999675
trial 62: query_time: 1.158444870999972
trial 63: query_time: 1.2652696480000145
trial 64: query_time: 1.035514126999999
trial 65: query_time: 1.1793643579999298
trial 66: query_time: 1.2436490709999362
trial 67: query_time: 1.214567666999983
trial 68: query_time: 1.2286974169999212
trial 69: query_time: 0.9482812449999756
trial 70: query_time: 1.3685508480000408
trial 71: query_time: 0.9726102689999152
trial 72: query_time: 1.245883790999983
trial 73: query_time: 1.278040676000046
trial 74: query_time: 1.204819663999956
trial 75: query_time: 1.6147211539999944
trial 76: query_time: 1.137744022999982
trial 77: query_time: 1.1147740330000033
trial 78: query_time: 1.299879447999956
trial 79: query_time: 1.057496297000057
trial 80: query_time: 1.2484915649999948
trial 81: query_time: 1.0604711710000174
trial 82: query_time: 1.1205998530000443
trial 83: query_time: 1.4988737979999769
trial 84: query_time: 1.2238189800000328
trial 85: query_time: 1.2590955040000154
trial 86: query_time: 1.3017927930000042
trial 87: query_time: 1.212496326000064
trial 88: query_time: 1.2545615830000543
trial 89: query_time: 1.1321148170000015
trial 90: query_time: 1.0944500770000332
trial 91: query_time: 1.2451703630000566
trial 92: query_time: 1.2014130669999759
trial 93: query_time: 1.136929820999967
trial 94: query_time: 0.9320932310000671
trial 95: query_time: 0.8415176459999429
trial 96: query_time: 0.9394534210000529
trial 97: query_time: 0.9467821939999794
trial 98: query_time: 1.2511406579999402
trial 99: query_time: 0.956289485999946
Priority: P1
Feature: Apply Group By, 1 tissue expanded (blood, group by = disease)
Load Time: 42s
Desired Load Time: 5s
Minimally Acceptable Load Time: 10s