camptocamp / inkmap

A library for generating high-quality, printable maps on the browser.
Other
86 stars 17 forks source link

Multiple calls to `queuePrint` result in same job id #56

Closed jussih closed 1 year ago

jussih commented 1 year ago

I have a dynamic number of map selections, which I queue for printing:

  // queuePrint gives the same id for all invocations
  const printJobIds = await Promise.all(
    selectionGeometries.map((polygon) =>
      queuePrint(createPrintSpec(polygon, [layer], size))
    )
  );
  const statusObservables: Observable<PrintStatus>[] = printJobIds.map((id) =>
    getJobStatus(id)
  );
  merge(...statusObservables)
    .pipe(finalize(() => pdf.download("map.pdf")))
    .subscribe((jobStatus) => {
      reportProgress(jobStatus);
      if (jobStatus.status === "finished" && jobStatus.imageBlob !== null) {
        pdf.addImageBlob(jobStatus.imageBlob);
      }
    });

The printJobIds returned are all the same, the id of the first job. Hence all statusObservables observe the same job. Looking at the "[inkmap] message to main:" console logs, all the jobs are properly queued and are processed, only the returned ids are incorrect from queuePrint.

The reason for this seems to be that calls to queuePrint all subscribe to the next value emitted from newJob$. It's a multicast observable and all the observers will take the first value it emits. Which is the id of the first job that the worker reports to main.

export function queuePrint(printSpec) {
  messageToPrinter(MESSAGE_JOB_REQUEST, { spec: printSpec });
  return newJob$
    .pipe(
      take(1),
      map((job) => job.id)
    )
    .toPromise();
}

The asynchronicity of this makes it a bit complicated, I guess the easiest option to solve this would be to manage the job ids in the main thread and report the new id eagerly back to the caller. The message to the worker would then include the spec and the job id, which the worker will use for reporting back. Perhaps the id should be a UUID to avoid keeping state for the counter in this case.

jussih commented 1 year ago

Of course turning the job id into a UUID would be a breaking change. To avoid that queuePrint could generate a unique token which would be sent back from the worker. newJob$ could then be filtered by this token, so every promise would return the job id corresponding to their own request.

The unique token could even be a hash of the spec object since it's already sent back from the worker, thus requiring no changes in worker side.

Something like this:

export function queuePrint(printSpec) {
  messageToPrinter(MESSAGE_JOB_REQUEST, { spec: printSpec });
  const specHash = hashFunc(printSpec);
  return newJob$
    .pipe(
      filter(job => hashFunc(job.spec) ===  specHash),
      take(1),
      map((job) => job.id)
    )
    .toPromise();
}

What do you think? Calling queuePrint twice with the same spec would still result in getting the same job id for both, but that shouldn't matter since it's the same request anyway.

jussih commented 1 year ago

Submitted an example solution as PR - since generating a reliable hash from an object is not trivial, it's simpler to just use a (pseudo)random uuid to match the queue request and job. Tested this locally and my example code with merged observables is now working correctly with this.

There are now basically 2 duplicate job identifiers, jobId and jobContext. Code would be simpler if jobId was replaced by the main-generated uuid, but this allows for backwards compatibility. Otherwise you would have to bump to 2.x :)

jahow commented 1 year ago

Hey @jussih, thanks for the detailed write-up and explanation. This is indeed something that was overlooked when writing the logic for creating new jobs on the printer thread; it was just assumed that listening to the first "new job" coming from the printer would always give us the one we expected.

I agree that your approach is the best, no change at all to the API and a minimal impact on the codebase. I still like that the printer thread is responsible for giving jobs their identifier. On the main thread using an object hash is probably overkill IMO, so a random identifier sounds like the best way forward.

I'm currently reviewing your PR and considering different options