marieai / marie-ai

Integrate AI-powered Document Analysis Pipelines
MIT License

Size mismatch for generated assets #57

Closed gregbugaj closed 1 year ago

gregbugaj commented 1 year ago

We have a size mismatch when generating document assets that undergo the 'document cleanup' step. This happens because preprocessing forces both image dimensions to be multiples of 32.

An image of size 2560 x 2384 will become 2592 x 2400. Note that the width grows even though 2560 is already divisible by 32.

    def preprocess(self, img: np.ndarray) -> np.ndarray:
        # check if PIL image
        if not isinstance(img, Image.Image):
            img = Image.fromarray(img)

        # make sure image is divisible by 32
        ow, oh = img.size
        base = 32
        if ow % base != 0 or oh % base != 0:
            h = oh // base * base + base
            w = ow // base * base + base
            img = img.resize((w, h), Image.LANCZOS)

        # convert to numpy array
        return np.array(img)
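Tracing the arithmetic of the snippet above for the 2560 x 2384 example shows why both dimensions grow: once either side fails the divisibility check, `// base * base + base` is applied to both sides, so a width that is already a multiple of 32 is still bumped by a full step. The following is a minimal sketch of that arithmetic next to a plain ceiling rounding; `bump` and `round_up` are hypothetical helper names used only for illustration and are not part of marie-ai.

```python
BASE = 32


def bump(n: int, base: int = BASE) -> int:
    """Formula from preprocess: floor to a multiple, then always add one step."""
    return n // base * base + base


def round_up(n: int, base: int = BASE) -> int:
    """Ceiling to the next multiple of base; exact multiples are left alone."""
    return -(-n // base) * base


ow, oh = 2560, 2384
print(bump(ow), bump(oh))          # 2592 2400
print(round_up(ow), round_up(oh))  # 2560 2400
```

With ceiling rounding only the height would change, which narrows the mismatch but does not remove it.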

On top of that, we are performing a resize instead of an overlay, so we are effectively modifying the original image.
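One possible way to overlay rather than resize is to paste the untouched image onto a larger canvas whose sides are multiples of 32. This is a sketch under assumptions, not the fix that was adopted: `pad_to_multiple` is a hypothetical helper, the white fill and top-left anchoring are illustrative choices, and it is only exercised here on a grayscale image.

```python
import numpy as np
from PIL import Image


def pad_to_multiple(img: Image.Image, base: int = 32, fill: str = "white") -> Image.Image:
    """Paste img onto a canvas with sides rounded up to multiples of base.

    The original pixels are copied unchanged; only right/bottom margins are added.
    """
    ow, oh = img.size
    w = -(-ow // base) * base  # ceiling to the next multiple of base
    h = -(-oh // base) * base
    if (w, h) == (ow, oh):
        return img  # already aligned, nothing to do
    canvas = Image.new(img.mode, (w, h), fill)
    canvas.paste(img, (0, 0))  # anchor the original at the top-left corner
    return canvas


# Small grayscale stand-in for a document page (height 50, width 70).
src = np.arange(50 * 70, dtype=np.uint8).reshape(50, 70)
padded = pad_to_multiple(Image.fromarray(src))
print(padded.size)  # (96, 64)
```

Because the original region is copied pixel for pixel, it can later be recovered exactly by cropping to the original width and height.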

The expectation is that after running segment_frame, the input and output images are the same size.

    real, mask, blended = self.overlay_processor.segment_frame(
        doc_id, frame
    )
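To illustrate the expectation, here is a hedged sketch of restoring the input size on the outputs. It assumes the outputs are NumPy arrays of shape (H, W) or (H, W, C) and that any extra pixels sit on the right and bottom edges; neither is confirmed by the issue. `crop_to_original` is a hypothetical helper, and the arrays below are stand-ins rather than real `segment_frame` results.

```python
import numpy as np


def crop_to_original(arr: np.ndarray, height: int, width: int) -> np.ndarray:
    """Drop right/bottom margins so arr matches the original frame size."""
    return arr[:height, :width]


# Stand-ins: a 2384 x 2560 input frame and 2400 x 2560 processed outputs.
frame = np.zeros((2384, 2560, 3), dtype=np.uint8)
real = np.zeros((2400, 2560, 3), dtype=np.uint8)
mask = np.zeros((2400, 2560), dtype=np.uint8)

h, w = frame.shape[:2]
real = crop_to_original(real, h, w)
mask = crop_to_original(mask, h, w)

# The check the issue asks for: outputs match the input's height and width.
assert real.shape[:2] == frame.shape[:2]
assert mask.shape[:2] == frame.shape[:2]
```

Cropping only restores the original content if the image was padded; after a resize, the pixels have already been resampled and a crop would cut off part of the page.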