CARTAvis / carta-backend

Source code repository for the backend component of CARTA, a new visualization tool designed for the ALMA, the VLA and the SKA pathfinders.
https://cartavis.github.io/
GNU General Public License v3.0
22 stars 11 forks source link

Fix unintentional insertion of null pointers into the sessions map #1249

Closed confluence closed 1 year ago

confluence commented 1 year ago

Description

This fixes #1248.

If a scripting request was made with an invalid session ID, a null pointer was unintentionally inserted into the session manager's sessions map when the [] operator was used to access the map with a key which did not exist. Subsequently when a file was opened, the session manager would attempt to close cached file loaders for that file by iterating over the sessions in the map, and crash when it tried to call a function on the null pointer.

There were a few other places in the session manager code which were using the same logic -- these may never have triggered the issue because they were never called with a key that did not exist, or perhaps because if they ever were (e.g. because the session had already been deleted) if was shortly before shutdown. Nevertheless, I have replaced all of those instances as well -- all remaining instances of _sessions[session_id] occur while the session mutex is locked after the key is found in the map.

While I think that this change guarantees that we will never insert a null pointer into the sessions map by mistake, I'm concerned that the code doesn't currently protect against a session pointer being invalidated by the session being deleted from another thread. Fixing this would require a more careful examination of the threading logic to determine where locks (or reference counting) need to be added. I suggest discussing this in a separate issue. This particular bug was not caused by this type of race condition.

There are no companion PRs.

Checklist

confluence commented 1 year ago

I will refactor this now to use at (much cleaner!) and to use find instead of count.

It does seem that using shared pointers is the correct thing to do here and would solve a lot of problems. I think that we should just do it, but probably in a separate PR -- it's not necessary for fixing this bug, and would require careful testing.

github-actions[bot] commented 1 year ago

Code Coverage

Package Line Rate Health
src.Cache 68%
src.DataStream 52%
src.FileList 68%
src.Frame 51%
src.HttpServer 43%
src.ImageData 28%
src.ImageFitter 89%
src.ImageGenerators 52%
src.ImageStats 74%
src.Logger 44%
src.Main 54%
src.Region 19%
src.Session 30%
src.Table 52%
src.ThreadingManager 87%
src.Timer 85%
src.Util 50%
Summary 39% (6818 / 17593)
confluence commented 1 year ago

@kswang1029 I changed the logic of the changes a little. Could you please re-run the e2e tests just in case?