Auto-generated reports - Githubissues

seasketch / next

A modernization version of the SeaSketch platform, released in 2022.

https://seasketch.org

BSD 3-Clause "New" or "Revised" License

7 stars 0 forks source link

Auto-generated reports #812

Open wmcclin opened 1 month ago

wmcclin commented 1 month ago

It would be great if reports for sketches were started as soon as the sketch was completed. Likewise, it would be great if the reports for sketches (or collections) began generating as soon as they were posted to a forum. That way, in many cases, when the reports are inspected for the first time, it's likely that they wouldn't need any time to generate and, instead, were pulled from a cache.

underbluewaters commented 1 month ago

The key question for this one is whether opening a report when geoprocessing is already in progress would kick off a new geoprocessing request, or if it would wait for the already in-progress request. I'm not sure whether the existing caching system would handle that. If not, it wouldn't result in an improvement in perceived performance unless geoprocessing completely finished before the user opened the report.

twelch commented 1 month ago

I'll double check this at a deeper level. Abby and I did check that in SeaSketch right now, if you start a report run in one tab, then stagger open the same report in another tab, results will roll into both tabs at the same time, using the same web socket URL. Need to confirm that it's not double running the run lambda.

twelch commented 1 month ago

In short, you will definitely get an improvement, at least for async functions (almost all reports are), so you should be fine to move ahead with this. I just need to make sure it's not doing more work than it needs to.

One thing to think about is that this feature will complicate debugging in production. I think we would want to at least be able to administratively turn this feature off when necessary.

underbluewaters commented 1 month ago

Thanks for checking in on that. Should be a relatively simple thing to get SeaSketch to prompt these.

@twelch can you elaborate on how you would want to control this? Would it be project-specific for all users, user-specific, sketch-specific, global?

twelch commented 1 month ago

@underbluewaters can you share a little bit about how the auto-generating mechanism would work? Would it load a hidden iframe?

My request for being able to disable this is really just a gut reaction. After thinking it through, we probably don't need it. If a lambda fails, the error and call stack is in the dynamodb item, and that will be displayed with the user clicks to view the report.

I'm used to using seasketch as the primary report debugging tool and having view of the all the network request. In this case of auto-starting we wouldn't see the initial start request for each lambda, but the subsequent click on "View attributes and reports" would generate its own.

twelch commented 1 month ago

Perhaps you had in mind reading the service manifest and calling all of the published gp function lambdas directly, so that any/all reports client should then be ready to load.

twelch commented 1 month ago

For any gp function that accepts extra runtime parameters (e.g. geographyId), it should run with its default values when one is not provided. Just to say that SeaSketch probably doesn't need to be aware of the extraParams.

twelch commented 1 month ago

If we did want to be able to disable this feature, project level or a little finer would be fine. I could see it being made available via the existing geoprocessing service configuration for a sketch class

underbluewaters commented 1 month ago

I'm thinking this would be something the server calls based on the service manifest. I'm now realizing it might not be that simple though. I guess the manifest potentially includes unused tasks, or tasks that apply to only sketches or only collections. Is the client the only piece that really knows what is appropriate to run?

twelch commented 1 month ago

Is the client the only piece that really knows what is appropriate to run?

More of this logic is held in the client these days yes. Without having had any need to expose it to externally to SeaSketch until now.

I guess the manifest potentially includes unused tasks

Unused tasks are possible, but not typical.

or tasks that apply to only sketches or only collections.

I always write gp functions that work with either a sketch or collection. Abby and Peter have followed along with that as far as I know. But you're right someone might not. I could add a new gp handler option like inputType: ("Sketch" | "SketchCollection")[] and expose in the service manifest. Without that, worst case is you invoke some lambdas with bad sketch input that errors and no one will ever see it.

Another thing to know about is worker functions, which we currently write and publish as sync geoprocessing functions. You shouldn't execute these directly. I could add an isWorker: boolean property to gp handler options. Or maybe go more direct and let the function publisher control when to pre-execute with a preCache: boolean gp handler option. Which could solve for the worker situation, the published but unused function situation, and any other situation that comes up in the future we haven't thought about.

underbluewaters commented 1 month ago

Sounds like opt-in might be the way to go here. We could have preCache be an enum like SKETCH | COLLECTION | NONE.

Along those lines and in the interests of having debugging control, would it make sense to just list of gp functions where sketch classes are configured in SeaSketch admin and have it be enabled there? In that case we'd still want an isWorker boolean to avoid listing those as options. Any preference for whether this is controlled in geoprocessing metadata vs in the SeaSketch admin interface?