xuebinqin / U-2-Net

The code for our newly accepted paper in Pattern Recognition 2020: "U^2-Net: Going Deeper with Nested U-Structure for Salient Object Detection."
Apache License 2.0

Core ML format #118

Open ivptr opened 3 years ago

ivptr commented 3 years ago

Is there an Apple Core ML version of the U^2-Net model?

umerzia-7001 commented 3 years ago

I tried converting the model to Core ML format, but there is an issue: it displays a black image in the iOS app, and when I run a prediction on the converted model it gives me a "no input layer in neural network" error. Any solution, or a working Core ML model?

BradF-99 commented 3 years ago

Would like an update on this as well, if possible. I've also tried converting the model to CoreML 5 using coremltools, but it appears to produce random results:

[Screenshot: output from the converted Core ML model]

I'm not sure if the issue is in the conversion, or with CoreML itself - I've attached the Jupyter Notebook I created in order to attempt conversion of the model.

umerzia-7001 commented 3 years ago

Hi Brad, could you be kind enough to provide me the Core ML model you got for the app? Thanks in advance.

BradF-99 commented 3 years ago

Hi Muhammad,

I've attached the converted U^2net lite model from the Jupyter Notebook. coremltools detected all 7 outputs and flattened them, marking them as MLMultiArray outputs, so I've set up extra layers to convert all 7 outputs into grayscale 320x320 images. The out_p0 output layer is equivalent to d1 (pre-normalisation) from the original model. If needed I can remove the extra layers and provide a model that outputs the default MLMultiArrays (which appear to be 1x1x320x320 in shape).

u2netpconv.mlmodel.zip
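
One alternative to adding extra layers, for anyone reproducing this without the notebook, is to rewrite the output feature types in the converted model's spec. This is only a rough sketch: the file name u2netp.mlmodel is a placeholder, it may need tweaking depending on the converted model's output shapes, and since the raw outputs are sigmoid values in the 0-1 range they may still need scaling to 0-255 somewhere before they render as anything but a near-black image.

import coremltools as ct
import coremltools.proto.FeatureTypes_pb2 as ft

# Load the converted model's spec and re-declare every MLMultiArray output
# as a 320x320 grayscale image ("u2netp.mlmodel" is a placeholder file name).
spec = ct.utils.load_spec("u2netp.mlmodel")
for output in spec.description.output:
    if output.type.WhichOneof("Type") == "multiArrayType":
        output.type.imageType.colorSpace = ft.ImageFeatureType.ColorSpace.Value("GRAYSCALE")
        output.type.imageType.width = 320
        output.type.imageType.height = 320

ct.utils.save_spec(spec, "u2netp_image_output.mlmodel")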

umerzia-7001 commented 3 years ago

I need the output to be of type MultiArray (Int32, 513 x 513). Could you do that? The input should also be 513 x 513.

umerzia-7001 commented 3 years ago

These dimensions and types

BradF-99 commented 3 years ago

Unfortunately I can't change the input and output sizes, so you may have to upsample the output yourself (I believe the U^2-Net test Python script uses bilinear upsampling). However, this model accepts a colour image input of 513x513 and supplies outputs as 320x320 MLMultiArrays - the "out_a0" output is equivalent to d1. I haven't been able to test it, unfortunately, and I still haven't been able to get U^2-Net full or lite working correctly in Core ML, so it's likely that this model won't work correctly either.

u2netp513.mlmodel.zip
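
Side note on that upsampling: the repo's u2net_test.py resizes the predicted mask back to the source image size with bilinear resampling, so a rough Python equivalent for a 320x320 output might look like the following sketch (here pred, orig_width and orig_height are placeholders for the model output array and the original image size):

import numpy as np
from PIL import Image

# pred: 320x320 float array in [0, 1], taken from the model's mask output
mask = Image.fromarray((pred * 255).astype(np.uint8))
# resize back up to the original image size (bilinear, as in u2net_test.py)
mask = mask.resize((orig_width, orig_height), resample=Image.BILINEAR)
mask.save("mask.png")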

MahmoudSalem92 commented 3 years ago

Hi Brad, thank you very much for sharing the model. Could you share the updated script? Regards

ldenoue commented 3 years ago

@BradF-99 and @MahmoudSalem92 when you run the CoreML model on an iOS device, how fast does it predict? Is it real-time or several seconds per image?

BradF-99 commented 3 years ago

@BradF-99 and @MahmoudSalem92 when you run the CoreML model on an iOS device, how fast does it predict? Is it real-time or several seconds per image?

In my testing, I don't think my Core ML converted model would be able to predict in real time - it was around a second or two per image. However, I'm using an iPhone X, and I've heard that the Neural Engine in the A11 is not as fast as the newer A-series iterations; I'm also not using Obj-C or Swift in my app, so it's likely that there are some inefficiencies there.

It still doesn't get the result right either - maybe it needs some sort of pre-processing on the image. I've just returned from a holiday, so I'll look into it when I have time.

MahmoudSalem92 commented 3 years ago

@BradF-99 Thank you very much for your response. Can you upload the code that generated the model? Was it developed in Python?

BradF-99 commented 3 years ago

Hi @MahmoudSalem92, the model that was generated is here - this one converts the output to an image, and this one leaves it as an MLMultiArray. As for the conversion I used a Jupyter Notebook.

MahmoudSalem92 commented 3 years ago

Hi @BradF-99, thank you very much for your response. The code you shared generates random results. Do you have the code for this model: u2netpconv.mlmodel?

BradF-99 commented 3 years ago

As I stated above, I wasn't successful with the conversion; clearly it's missing some image processing somewhere. Maybe I've missed something - feel free to look through the Jupyter Notebook I attached.

jb-apps commented 3 years ago

Hi @BradF-99, I am getting exactly the same result as you. I think we have to do some preprocessing on the inputs, but I haven't been able to figure it out. Have you had any luck? Thanks!

BradF-99 commented 3 years ago

Hi @jb-apps, I did notice that the original PyTorch model requires images in a "BRG" format as opposed to BGR or RGB - maybe this is the pre-processing my Core ML model is missing? Let me know if you have any success; I haven't looked into this for a while.

jb-apps commented 3 years ago

Thanks for the quick response @BradF-99 - according to this, the inputs should be RGB! We may need to do some pre-processing. I am still new to Core ML; I will keep investigating.
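
For reference, the repo's data loader (ToTensorLab in data_loader.py) rescales the RGB image and then applies ImageNet mean/std normalisation before it reaches the network. coremltools' ImageType only takes a single scalar scale plus a per-channel bias, so it can only approximate the per-channel std; here is a hedged sketch of that approximation (the 320x320 shape is just the u2netp default and is an assumption here):

import coremltools as ct

# Approximate (x/255 - mean) / std with ImageNet statistics.
# ImageType's scale is a single scalar, so 0.226 (roughly the average of the
# per-channel stds 0.229/0.224/0.225) stands in for all three channels.
image_input = ct.ImageType(
    name="input_1",
    shape=(1, 3, 320, 320),
    scale=1 / (255.0 * 0.226),
    bias=[-0.485 / 0.229, -0.456 / 0.224, -0.406 / 0.225],
)
# then pass inputs=[image_input] to ct.convert(traced_model, ...)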

yamamon0402 commented 3 years ago

Hi @BradF-99, thank you very much for your Jupyter Notebook - it was very helpful. In my case, it worked better after I fixed it as follows. Just add the scale param.

  • before
model = ct.convert(traced_model, inputs=[ct.ImageType(name="input_1", shape=example_input.shape)])
  • fixed
model = ct.convert(traced_model, inputs=[ct.ImageType(name="input_1", shape=example_input.shape,scale=1/255.0)])

[image: cat]

iTarek commented 3 years ago

Hi @BradF-99, thank you very much for your Jupyter Notebook - it was very helpful. In my case, it worked better after I fixed it as follows. Just add the scale param.

  • before
model = ct.convert(traced_model, inputs=[ct.ImageType(name="input_1", shape=example_input.shape)])
  • fixed
model = ct.convert(traced_model, inputs=[ct.ImageType(name="input_1", shape=example_input.shape,scale=1/255.0)])
[image: cat]

Hi, I'm facing an issue converting the model. Can you please share the .mlmodel?

Djeevs commented 3 years ago

Hi @BradF-99, thank you very much for your Jupyter Notebook - it was very helpful. In my case, it worked better after I fixed it as follows. Just add the scale param.

  • before
model = ct.convert(traced_model, inputs=[ct.ImageType(name="input_1", shape=example_input.shape)])
  • fixed
model = ct.convert(traced_model, inputs=[ct.ImageType(name="input_1", shape=example_input.shape,scale=1/255.0)])
[image: cat]

Can you please share the full script you used for the conversion? I've looked everywhere and couldn't find a fully working solution. Also, do you use Big Sur? I have some problems on Catalina - it looks like you need Core ML 5 to change nodes.

Anandh-iOS commented 3 years ago

Hi @MahmoudSalem92, the model that was generated is here - this one converts the output to an image, and this one leaves it as an MLMultiArray. As for the conversion I used a Jupyter Notebook.

Hi Brad,

[Screenshot: error after conversion]

I am getting this error after conversion using the Jupyter Notebook - kindly guide me on what's wrong here.

whiteio commented 3 years ago

@iTarek @Djeevs @Anandh-iOS This should work.


import torch
import coremltools as ct
from model import U2NET

# build the full U^2-Net (3 input channels, 1 output channel) and load the weights
net = U2NET(3, 1)
net.load_state_dict(torch.load("u2net.pth", map_location=torch.device('cpu')))
net.eval()

# trace the model with a dummy 1x3x512x512 input, then convert to Core ML;
# scale=1/255.0 maps 0-255 pixel values into a 0-1 range
example_input = torch.rand(1, 3, 512, 512)
traced_model = torch.jit.trace(net, example_input)

converted_model = ct.convert(
    traced_model,
    inputs=[ct.ImageType(name="input_1", shape=example_input.shape, scale=1/255.0)]
)

converted_model.save('U2Net.mlmodel')
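
A quick way to sanity-check the converted model from Python on macOS before wiring it into an app - this is only a sketch: the test image name is a placeholder, and the numeric output names are auto-generated by the converter, so check what predict() actually returns:

import numpy as np
from PIL import Image

img = Image.open("test.png").convert("RGB").resize((512, 512))
out = converted_model.predict({"input_1": img})
print(out.keys())  # auto-generated output names, one per U^2-Net side output

# Pick the output corresponding to the fused prediction (d1 in u2net_test.py)
# and min-max normalise it, as u2net_test.py does, before scaling to 0-255 -
# without this step the saved mask can easily look all black.
pred = np.squeeze(np.array(out["2169"]))  # replace "2169" with a name printed above
pred = (pred - pred.min()) / (pred.max() - pred.min() + 1e-8)
Image.fromarray((pred * 255).astype(np.uint8)).save("mask.png")
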
Anandh-iOS commented 3 years ago

Hi @whiteio,

Thanks for your response - it works and converts to .mlmodel, but I get only a black image on prediction.

// Swift
let resized_image = image?.scaleImage(targetSize: CGSize(width: 512, height: 512))
let buffer = resized_image?.cgImage?.pixelBuffer()
let result = try? u2p!.prediction(input_1: buffer!)
let image: UIImage = createUIImage(fromFloatArray: try! result!._2169.reshaped(to: [1,512,512]), min: 0, max: 255)!

Kindly share if you find anything wrong here.

Thanks, Anandh

LA-Labs commented 3 years ago

Just created a fully working demo: https://github.com/SmartyToe/Image-segmentation. Enjoy.

Anandh-iOS commented 3 years ago

Hi @LA-Labs,

Thanks for the converted mlmodel, it works. If possible, please share how you converted the PyTorch model to .mlmodel.

Thanks.

iTarek commented 3 years ago

Just created a fully working demo: https://github.com/SmartyToe/Image-segmentation. Enjoy.

You are the best - thank you so much for this amazing demo. One question: when I compare U^2-Net with DeepLabV3 for removing the background, DeepLabV3 is better. Is that normal, or is this version of U^2-Net not the fully trained one?

mgstar1021 commented 3 years ago

Just created a fully working demo: https://github.com/SmartyToe/Image-segmentation. Enjoy.

Thanks for your demo! It was helpful for me. However, it only works for square images and doesn't support other aspect ratios - for example, it works for 320x320 but doesn't work for 320x300. What is the solution?

duncanwilcox commented 3 years ago

First, many many thanks to @xuebinqin for sharing this very good model with a liberal license.

I'm not an ML expert and not a Python expert, but I am persistent. It turns out nothing quite worked for me with the current coremltools (5.0b3), except the @LA-Labs model, but I wanted to use the full U2Net instead of the U2NetP model.

Not being an ML expert makes many things baffling to me.

Frankly, it feels like the tooling is sooo primitive - but this is just a general comment on Python/PyTorch/ML; the results speak for themselves.

Anyway.

Apologies if this is non-idiomatic Python or non-idiomatic ML. I'm not an expert but I am persistent, and I think macOS/iOS developers can benefit from my findings, so here goes.

Using @BradF-99's Jupyter Notebook (another baffling tech) I derived the attached script that appears to work and produce an mlmodel that Xcode will accept:

u2net-to-mlmodel.py.zip

The relevant app code to apply the mask (Objective-C on macOS; should be easy enough to adapt to Swift and iOS):

    // run the model and grab the 320x320 grayscale mask (out_p0 corresponds to d1)
    NSError *err = nil;
    u2netOutput *result = [model predictionFromInput:pixelbuffer error:&err];
    CVPixelBufferRef data = [result featureValueForName:@"out_p0"].imageBufferValue;

    CIImage *maskimg = [[CIImage alloc] initWithCVPixelBuffer:data];

    // scale the 320x320 mask back up to the input image size
    CIFilter<CIBicubicScaleTransform> *scale = CIFilter.bicubicScaleTransformFilter;
    [scale setDefaults];
    scale.inputImage = maskimg;
    scale.parameterB = 0;
    scale.parameterC = 0.75;
    scale.scale = inputImage.extent.size.height / 320.;
    scale.aspectRatio = inputImage.extent.size.width / inputImage.extent.size.height;

    // blend: keep the input image where the mask is bright
    CIFilter<CIBlendWithMask> *blend = CIFilter.blendWithMaskFilter;
    [blend setDefaults];
    blend.inputImage = inputImage;
    blend.maskImage = scale.outputImage;

    CIImage *composite = blend.outputImage;
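
For testing outside the app, a rough Python equivalent of the same scale-and-composite step (just a sketch; the file names are placeholders and the mask is assumed to already be saved as an 8-bit grayscale image):

from PIL import Image

orig = Image.open("input.png").convert("RGBA")
# scale the 320x320 mask back up to the input size (bicubic, like the CIFilter above)
mask = Image.open("mask.png").convert("L").resize(orig.size, Image.BICUBIC)

# keep the input where the mask is bright, transparent elsewhere
empty = Image.new("RGBA", orig.size, (0, 0, 0, 0))
cutout = Image.composite(orig, empty, mask)
cutout.save("cutout.png")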

Hope this helps.