prjemian closed this issue 1 year ago.
The tiled server console output showed this error:
pymongo.errors.OperationFailure: Executor error during find command :: caused by :: Sort operation used more than the maximum 33554432 bytes of RAM. Add an index, or specify a smaller limit., full error: {'ok': 0.0, 'errmsg': 'Executor error during find command :: caused by :: Sort operation used more than the maximum 33554432 bytes of RAM. Add an index, or specify a smaller limit.', 'code': 96, 'codeName': 'OperationFailed'}
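The error's own suggested fix is to add an index on the sort key. Since these searches sort on time, an index on the start documents' time field should let MongoDB return sorted results without a blocking in-memory sort. A mongosh sketch, where the database name ("bluesky") and collection name ("run_start") are assumptions to verify against the actual databroker deployment:

```javascript
// mongosh sketch -- database and collection names are assumptions
const catalogDb = db.getSiblingDB("bluesky");   // the databroker database
catalogDb.run_start.createIndex({ time: 1 });   // index the sort key used by the query
```

Newer MongoDB versions also accept allowDiskUse on find/sort as a workaround, but the index addresses the root cause.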
The trace points to this code: https://github.com/BCDA-APS/tiled-viz2023/blob/99dd2c8ff18a441288ad4a37771d79a7a966d957/gemviz23/demo/resultwindow.py#L96-L97
This is another aspect of #46. Closing since it is the same bug.
Fails on other catalogs with fewer runs, as well.
Not sure if this error came from the tiled server or the MongoDB server.
On a different workstation, with more RAM, this process does not fail with a catalog of almost 9k runs.
Perhaps we could use cat.new_variation() instead?
Tried slicing instead, since the new_variation() method seems to do something different.
Same server error.
File "/home/beams/JEMIAN/.conda/envs/bluesky_2023_2/lib/python3.10/site-packages/tiled/client/node.py", line 352, in _keys_slice
content = self.context.get_json(
The tiled code referenced here has been revised:
warnings.warn( """The module 'tiled.client.node' has been moved to 'tiled.client.container' and the object 'Node' has been renamed 'Container'.""", DeprecationWarning, )
Might be worthwhile to update our local tiled versions (both server and client) before proceeding with this issue.
FYI, failing client has these versions:
(bluesky_2023_2) jemian@otz ~/.../gemviz23/demo $ conda list tiled
# packages in environment at /home/beams/JEMIAN/.conda/envs/bluesky_2023_2:
#
# Name Version Build Channel
tiled 0.1.0a91 hd8ed1ab_0 conda-forge
tiled-base 0.1.0a91 pyhd8ed1ab_0 conda-forge
tiled-client 0.1.0a91 hd8ed1ab_0 conda-forge
tiled-formats 0.1.0a91 hd8ed1ab_0 conda-forge
tiled-server 0.1.0a91 hd8ed1ab_0 conda-forge
failing server:
(tiled) jemian@otz ~/.../gemviz23/demo $ conda list tiled
# packages in environment at /home/beams/JEMIAN/.conda/envs/tiled:
#
# Name Version Build Channel
tiled 0.1.0a85 pypi_0 pypi
Ok client has these versions:
(bluesky_2023_2) prjemian@arf:~/.../gemviz23/demo$ conda list tiled
# packages in environment at /home/prjemian/.conda/envs/bluesky_2023_2:
#
# Name Version Build Channel
tiled 0.1.0a94 hd8ed1ab_0 conda-forge
tiled-base 0.1.0a94 pyhd8ed1ab_0 conda-forge
tiled-client 0.1.0a94 hd8ed1ab_0 conda-forge
tiled-formats 0.1.0a94 hd8ed1ab_0 conda-forge
tiled-server 0.1.0a94 hd8ed1ab_0 conda-forge
Ok server has these versions:
(base) prjemian@zap:~$ conda list -n tiled tiled
# packages in environment at /home/prjemian/.conda/envs/tiled:
#
# Name Version Build Channel
tiled 0.1.0a85 pypi_0 pypi
While this may be a resource problem on the server (and may be affected by #53), what can we do as a client to avoid triggering this problem?
We rely on random access to the UIDs in the table as a list:
(bluesky_2023_2) prjemian@arf:~/.../BCDA-APS/gemviz$ git grep uidList | grep -v _uidList
gemviz/resultwindow.py: value = len(self.uidList())
gemviz/resultwindow.py: uid = self.uidList()[index.row()]
gemviz/resultwindow.py: return (self.pageOffset() + len(self.uidList())) >= self.catalog_length()
gemviz/resultwindow.py: def uidList(self):
gemviz/resultwindow.py: end = start + len(self.uidList())
gemviz/resultwindow.py: uid = self.uidList()[index.row()]
gemviz/resultwindow.py: uid = self.uidList()[index.row()]
It would be a challenge to reference the UIDs through an iterator.
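One hedged alternative that keeps random access by row without materializing list(gen): a small page-caching wrapper that fetches fixed-size pages on demand. Here fetch_page is a hypothetical stand-in for a tiled keys-slice request, and the fake backend below stands in for the server; a sketch only, not the gemviz implementation:

```python
class PagedUidList:
    """Random access over a large catalog without loading every UID.

    `fetch_page(offset, limit)` is a hypothetical stand-in for a tiled
    keys-slice request; `length` would come from the catalog's length.
    """

    def __init__(self, fetch_page, length, page_size=100):
        self._fetch_page = fetch_page
        self._length = length
        self._page_size = page_size
        self._pages = {}  # page number -> list of UIDs

    def __len__(self):
        return self._length

    def __getitem__(self, index):
        if index < 0:
            index += self._length
        if not 0 <= index < self._length:
            raise IndexError(index)
        page, offset = divmod(index, self._page_size)
        if page not in self._pages:
            self._pages[page] = self._fetch_page(
                page * self._page_size, self._page_size
            )
        return self._pages[page][offset]


# usage with a fake backend standing in for the tiled server
all_uids = [f"uid-{i}" for i in range(250)]
uids = PagedUidList(lambda off, lim: all_uids[off:off + lim], len(all_uids))
first, last = uids[0], uids[-1]  # only two of the three pages are fetched
```

The table model's uidList()[index.row()] calls would then hit the cache for rows already shown and issue one page-sized request for new rows.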
Still fails with changes in #53:
File "/home/beams1/JEMIAN/Documents/projects/BCDA-APS/gemviz/gemviz/bluesky_runs_catalog_table_view.py", line 58, in doPagerButtons
model.doPager(action)
File "/home/beams1/JEMIAN/Documents/projects/BCDA-APS/gemviz/gemviz/bluesky_runs_catalog_table_model.py", line 110, in doPager
self.setUidList(self._get_uidList())
^^^^^^^^^^^^^^^^^^^
File "/home/beams1/JEMIAN/Documents/projects/BCDA-APS/gemviz/gemviz/bluesky_runs_catalog_table_model.py", line 129, in _get_uidList
return list(gen) # FIXME: fails here with big catalogs, see issue #51
^^^^^^^^^
File "/home/beams/JEMIAN/.conda/envs/bluesky_2023_3/lib/python3.11/site-packages/tiled/client/container.py", line 349, in _keys_slice
content = handle_error(
^^^^^^^^^^^^^
File "/home/beams/JEMIAN/.conda/envs/bluesky_2023_3/lib/python3.11/site-packages/tiled/client/utils.py", line 18, in handle_error
response.raise_for_status()
File "/home/beams/JEMIAN/.conda/envs/bluesky_2023_3/lib/python3.11/site-packages/httpx/_models.py", line 749, in raise_for_status
raise HTTPStatusError(message, request=request, response=self)
httpx.HTTPStatusError: Server error '500 Internal Server Error' for url 'http://localhost:8020/api/v1/search/9idc_usaxs_retired_2022-08-10?page%5Boffset%5D=17443&fields=&filter%5Bcomparison%5D%5Bcondition%5D%5Boperator%5D=ge&filter%5Bcomparison%5D%5Bcondition%5D%5Boperator%5D=le&filter%5Bcomparison%5D%5Bcondition%5D%5Bkey%5D=time&filter%5Bcomparison%5D%5Bcondition%5D%5Bkey%5D=time&filter%5Bcomparison%5D%5Bcondition%5D%5Bvalue%5D=1642140000.0&filter%5Bcomparison%5D%5Bcondition%5D%5Bvalue%5D=1667887199.999&sort=time'
For more information check: https://httpstatuses.com/500
Instead of moving to the far end of the catalog, setting the pageSize is an alternative. Depending on the server, this might begin to be slow. A progress-bar window with a [cancel] button should be posted if the list(gen) operation takes too long.
Also, should post notice if/when the operation fails (as originally noted).
Might be an algorithm design problem in gemviz. We might be trying to do this the hard way. Instead, there may be some catalog metadata that provides the catalog length and first+last runs info.
Time to revisit the pager implementation. Can we pass this handling to the tiled server?
tiled server shows this error:
pymongo.errors.OperationFailure: Executor error during find command :: caused by :: Sort operation used more than the maximum 33554432 bytes of RAM. Add an index, or specify a smaller limit.
The client can't see this error, only "500 Internal Server Error". The client can advise the user to adjust filters to reduce the size of the result set.
Why is PyMongo limited to a maximum sort RAM of 33,554,432 bytes? Can this be increased? The tiled server has a larger OBJECT CACHE.
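The figure in the error is a MongoDB limit, not the tiled object cache: 33,554,432 bytes is exactly 32 MiB, which matches MongoDB's default budget for blocking (un-indexed) sorts. The server parameter that controls it is, I believe, internalQueryMaxBlockingSortMemoryUsageBytes (an assumption to check against the MongoDB docs), so raising it is a MongoDB setting rather than a tiled one:

```python
# the limit quoted in the pymongo error, in bytes
SORT_LIMIT_BYTES = 33_554_432

# exactly 32 MiB, MongoDB's default blocking-sort memory budget
is_32_mib = SORT_LIMIT_BYTES == 32 * 1024 ** 2
```

Raising the limit only postpones the failure as catalogs grow; an index on the sort key avoids the blocking sort altogether.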
Loaded a catalog with 88k runs. The first 20 runs displayed in the results table. After pressing the "last" pager button, the app crashed with this exception trace: