josephch405 / jit-masker


Very poor result #6

Open rose-jinyang opened 3 years ago

rose-jinyang commented 3 years ago

Hello, how are you? Thanks for contributing this paper and project. I tested this project with several images and a web camera, and the results are very bad.

image

image

The images on the right are the masks produced by the pre-trained JIT-Net model you provided. I tested with several input sizes (128x128, 224x224, 512x512). The paper says that this method outperforms U2-Net in both accuracy and speed.

image

But these results are very poor. The U2-Net model is slow but very accurate. How should I interpret this?

josephch405 commented 3 years ago

Hi - we didn't train the starter JIT-Net model for very long, so you'd have to run it on video input for the distillation effects to manifest. Therefore, as expected, single-shot inference on an individual image does quite poorly. If you want a better pretrained model, you would need to train the small JIT-Net model on a better corpus of facial images (we didn't get around to this in time).
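To illustrate why video input matters here, below is a minimal toy sketch of online distillation in plain Python. It is not the code from this repo: `teacher_mask`, `TinyStudent`, and `run_online_distillation` are hypothetical names, the "teacher" is a simple threshold standing in for a slow, accurate model, and the "student" is a one-feature logistic model standing in for a small network. The only point is that a weak student starts out poor on the first frame and improves as it is updated against the teacher's output frame by frame.

```python
import math
import random


def teacher_mask(frame):
    # Hypothetical stand-in for a slow, accurate teacher network:
    # marks a pixel as foreground when its intensity exceeds 0.5.
    return [1.0 if p > 0.5 else 0.0 for p in frame]


class TinyStudent:
    """Toy student: a per-pixel logistic model with one weight and one bias."""

    def __init__(self):
        # Deliberately uninformative start, like an under-trained model.
        self.w = 0.0
        self.b = 0.0

    def predict(self, frame):
        return [1.0 / (1.0 + math.exp(-(self.w * p + self.b))) for p in frame]

    def distill_step(self, frame, target, lr=0.5):
        # One gradient step on the cross-entropy between the student's
        # prediction and the teacher's mask for this frame.
        preds = self.predict(frame)
        n = len(frame)
        grad_w = sum((y - t) * p for y, t, p in zip(preds, target, frame)) / n
        grad_b = sum(y - t for y, t in zip(preds, target)) / n
        self.w -= lr * grad_w
        self.b -= lr * grad_b


def agreement(student, frame):
    # Fraction of pixels where the thresholded student matches the teacher.
    target = teacher_mask(frame)
    preds = student.predict(frame)
    hits = sum((y > 0.5) == (t > 0.5) for y, t in zip(preds, target))
    return hits / len(frame)


def run_online_distillation(frames, stride=1, steps_per_update=20):
    student = TinyStudent()
    history = []
    for i, frame in enumerate(frames):
        # Score the student on the new frame before it has trained on it.
        history.append(agreement(student, frame))
        # Every `stride` frames, query the teacher and update the student.
        if i % stride == 0:
            target = teacher_mask(frame)
            for _ in range(steps_per_update):
                student.distill_step(frame, target)
    return history


# Synthetic "video": 30 frames of 64 random pixel intensities each.
rng = random.Random(0)
frames = [[rng.random() for _ in range(64)] for _ in range(30)]
history = run_online_distillation(frames)
print("first frame agreement:", history[0])
print("mean agreement over last 5 frames:", sum(history[-5:]) / 5)
```

The first entry of `history` corresponds to the single-image case described in the issue: the student has not yet been adapted to anything, so its mask is poor. Raising `stride` would mean fewer teacher calls, at the cost of slower adaptation in this toy setup.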