icecube / skymap_scanner

A distributed system that performs a likelihood scan of event directions for IceCube real-time alerts using CPU cluster(s) and queue-based message passing.

Move GCD uncompress to server-side #150

Open mlincett opened 1 year ago

mlincett commented 1 year ago

Currently, the Skymap Scanner implements frame object diff logic for GCD information.

Alerts coming from the Pole carry so-called "GCD diff" frames, which reduce bandwidth compared to sending the whole GCD package north. In the presence of a GCD diff, the Skymap Scanner fetches the baseline GCD and rebuilds the full GCD information (uncompress).
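To make the diff/uncompress idea concrete, here is a toy sketch in plain Python. It is not the Skymap Scanner code and does not use the IceCube frame API: the GCD is modeled as a flat dict, and `make_diff`, `apply_diff`, and the key names are all hypothetical.

```python
from typing import Any, Dict

# Sentinel marking a key that exists in the baseline but not in the full object.
_REMOVED = object()


def make_diff(baseline: Dict[str, Any], full: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the entries of `full` that differ from `baseline`."""
    diff = {k: v for k, v in full.items() if k not in baseline or baseline[k] != v}
    # Record keys that were dropped relative to the baseline.
    diff.update({k: _REMOVED for k in baseline if k not in full})
    return diff


def apply_diff(baseline: Dict[str, Any], diff: Dict[str, Any]) -> Dict[str, Any]:
    """Rebuild the full object from the baseline plus the diff ("uncompress")."""
    full = dict(baseline)
    for key, value in diff.items():
        if value is _REMOVED:
            full.pop(key, None)
        else:
            full[key] = value
    return full


# Hypothetical contents, for illustration only.
baseline = {"geometry": "v1", "calibration": "c1", "detector_status": "d1"}
full = {"geometry": "v1", "calibration": "c2", "detector_status": "d7"}

diff = make_diff(baseline, full)      # only the changed entries need to travel
rebuilt = apply_diff(baseline, diff)  # receiver side: baseline + diff -> full
```

The point of the sketch is only that the rebuild needs the baseline to be available wherever `apply_diff` runs, which is what the server-side versus client-side question is about.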

The current functionality is as follows:

I see no benefit in performing this operation both server-side and client-side. It should be possible to move the uncompress stage to the server side and have the client work on the full GCD only.

dsschult commented 1 year ago

There is one benefit: sending a smaller GCD file to each client. I'm not sure if there are any performance implications of sending the full GCD to each client, since 30 MB x 1000 clients (about 30 GB in total) is a bit large if they all start at the same time. We can of course work around that with some cleverness (e.g. host the GCD on a fast server).
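For a rough sense of scale, the aggregate transfer can be estimated as below. The 30 MB and 1000 clients come from the comment above; the 10 Gbit/s link speed is purely an assumption for illustration, and the function name is hypothetical.

```python
from typing import Tuple


def aggregate_transfer(size_mb: float, n_clients: int, link_gbit_s: float) -> Tuple[float, float]:
    """Return (total volume in GB, ideal transfer time in seconds) for one shared link."""
    total_gb = size_mb * n_clients / 1000   # decimal units: 1 GB = 1000 MB
    seconds = total_gb * 8 / link_gbit_s    # 8 bits per byte, no protocol overhead
    return total_gb, seconds


# Assumed link speed of 10 Gbit/s; the real bottleneck may be elsewhere.
total_gb, seconds = aggregate_transfer(size_mb=30, n_clients=1000, link_gbit_s=10)
```

This is a best-case figure for a single saturated link; it ignores contention, latency, and any caching.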

mlincett commented 1 year ago

Thanks for pointing this out; I had not considered this aspect.

If this is really a concern, then we should approach the problem from the other direction. There are situations in which a full GCD transfer is currently forced and could potentially be avoided:

dsschult commented 1 year ago

I did some basic testing, and I think that, now that we're using S3 to transfer input files with SkyDriver, compression isn't an issue. We probably do want to separate the GCD from the JSON (otherwise that's a huge base64 blob in there).
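A small sketch of why embedding the file in the JSON is unattractive: base64 inflates binary data by about a third, and the whole payload has to be parsed to get at anything else. The payload layout, the `gcd` and `gcd_url` field names, and the URL are hypothetical, and the byte string is a scaled-down stand-in for a real GCD file.

```python
import base64
import json

# Stand-in for GCD content, scaled down from ~30 MB to 30 kB.
gcd_bytes = bytes(30_000)
encoded = base64.b64encode(gcd_bytes).decode("ascii")

# Option A: embed the file in the JSON payload as base64 text.
embedded = json.dumps({"event": "placeholder", "gcd": encoded})

# Option B: ship the file separately and only reference it in the JSON.
referenced = json.dumps({"event": "placeholder", "gcd_url": "s3://example-bucket/gcd-file"})

# base64 emits 4 output characters for every 3 input bytes.
overhead = len(encoded) / len(gcd_bytes)
```

With option B the JSON stays small regardless of the GCD size, and the file itself can be moved by whatever transfer mechanism is available.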

mlincett commented 1 year ago

> I did some basic testing, and I think that, now that we're using S3 to transfer input files with SkyDriver, compression isn't an issue. We probably do want to separate the GCD from the JSON (otherwise that's a huge base64 blob in there).

So the idea is that SkyDriver would write the GCD to an S3 bucket and the server would pass the object URL to the client?

dsschult commented 1 year ago

> > I did some basic testing, and I think that, now that we're using S3 to transfer input files with SkyDriver, compression isn't an issue. We probably do want to separate the GCD from the JSON (otherwise that's a huge base64 blob in there).
>
> So the idea is that SkyDriver would write the GCD to an S3 bucket and the server would pass the object URL to the client?

Close. It can actually pass the object URL to HTCondor, which will download it and put it in the directory the client starts in. So for the client, it looks like the GCD file is in $PWD.
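From the client's point of view, that could look roughly like the sketch below: the client does no downloading and simply expects the file in its working directory. The function name `find_local_gcd` and the filename used in the comment are hypothetical; this is not the Skymap Scanner client code.

```python
from pathlib import Path
from typing import Optional


def find_local_gcd(filename: str, workdir: Optional[Path] = None) -> Path:
    """Return the path of a GCD file that was placed in the working directory.

    `workdir` defaults to the current directory ($PWD), which is where the
    batch system is assumed to have put the transferred file.
    """
    directory = workdir if workdir is not None else Path.cwd()
    path = directory / filename
    if not path.is_file():
        raise FileNotFoundError(f"{filename} not found in {directory}")
    return path


# Example call (hypothetical filename):
# gcd_path = find_local_gcd("baseline_gcd.i3")
```

Because the client only looks at the local filesystem, the same code works whether the file arrived via a URL transfer, a regular file transfer, or a manual copy.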

dsschult commented 1 year ago

I should note that if you are running this manually, you can also transfer the GCD directly via condor file transfer or any other method. So this would work outside SkyDriver.