deepjavalibrary / djl

An Engine-Agnostic Deep Learning Framework in Java
https://djl.ai
Apache License 2.0

Loading trained model with custom dataset #220

Closed · androuino closed this issue 3 years ago

androuino commented 3 years ago

Question

Hi, I would like to ask how I can load a trained, hybridized Gluon model (the generated .params and .json files) in DJL. Should I also create a synset.txt file for the list of classes and then compress everything as a tar.gz file? I'd like to know the correct way to do it, because from what I have tried so far, I compressed these 3 files as a tar.gz file and then, in DJL, used this code to load the model:

Criteria<Image, DetectedObjects> criteria =
                Criteria.builder()
                        .optApplication(Application.CV.OBJECT_DETECTION)
                        .setTypes(Image.class, DetectedObjects.class)
                        .optModelUrls("file:///Users/home/mxnet/vgg16_atrous_custom")
                        .optFilter("backbone", "vgg16_atrous_custom")
                        .optModelName("vgg16_atrous_custom")
                        .optProgress(new ProgressBar())
                        .build();

but I am getting this error instead:

Exception in thread "main" ai.djl.repository.zoo.ModelNotFoundException: No matching model with specified Input/Output type found.
    at ai.djl.repository.zoo.ModelZoo.loadModel(ModelZoo.java:173)
    at ai.djl.examples.inference.ObjectDetection.predict(ObjectDetection.java:66)
    at ai.djl.examples.inference.ObjectDetection.main(ObjectDetection.java:48)

Thanks in advance for your help.

frankfliu commented 3 years ago
  1. How to convert a Gluon model to a symbolic model: http://docs.djl.ai/master/docs/mxnet/how_to_convert_your_model_to_symbol.html
  2. If you load a model directly from a URL, in most cases you need to supply a Translator (a complete end-to-end sketch follows this list):
    
    Translator<Image, DetectedObjects> translator = SingleShotDetectionTranslator.builder()
            .addTransform(new ToTensor())
            .optSynsetUrl("https://mysynset.txt")
            .build();

Criteria<Image, DetectedObjects> criteria = Criteria.builder()
        .setTypes(Image.class, DetectedObjects.class)
        .optModelUrls("file:///Users/home/mxnet/vgg16_atrous_custom")
        .optTranslator(translator)
        .build();


3. SingleShotDetectionTranslator.Builder accepts synset input in three ways:
.optSynsetArtifactName(String fileName): // expects an uncompressed file in the model directory
.optSynsetUrl(String url): // downloads synset.txt from any URL
.optSynset(List<String> classes): // you can pass the classes directly to the translator


4. optModelUrls() accepts either a local folder or an archive file (.zip/.tar). We don't expect individual files inside the model folder to be compressed; DJL handles compressed files in the following cases:
    1. You created your own model zoo and defined each file in metadata.json
    2. You compressed your model directory into a .zip, .tar, or .tar.gz file; DJL will uncompress the whole folder into the cache directory automatically

5. When you specify optModelUrls() with only a single URL, you don't need to define other filters or an artifactId.

6. We are working on an improvement that creates the Translator automatically, but this is still in progress.
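
Putting items 1 and 2 together, here is a minimal end-to-end sketch; the model URL, synset URL, and image path below are placeholders, not values from this issue:

    import ai.djl.inference.Predictor;
    import ai.djl.modality.cv.Image;
    import ai.djl.modality.cv.ImageFactory;
    import ai.djl.modality.cv.output.DetectedObjects;
    import ai.djl.modality.cv.transform.ToTensor;
    import ai.djl.modality.cv.translator.SingleShotDetectionTranslator;
    import ai.djl.repository.zoo.Criteria;
    import ai.djl.repository.zoo.ModelZoo;
    import ai.djl.repository.zoo.ZooModel;
    import ai.djl.translate.Translator;
    import java.nio.file.Paths;

    // Build a translator; the synset URL is a placeholder.
    Translator<Image, DetectedObjects> translator = SingleShotDetectionTranslator.builder()
            .addTransform(new ToTensor())
            .optSynsetUrl("https://mysynset.txt")
            .build();

    // Point at a model folder or a .zip/.tar/.tar.gz archive (see item 4).
    Criteria<Image, DetectedObjects> criteria = Criteria.builder()
            .setTypes(Image.class, DetectedObjects.class)
            .optModelUrls("file:///path/to/model")
            .optTranslator(translator)
            .build();

    // Load the model and run one prediction (the enclosing method must declare
    // IOException, ModelException, and TranslateException).
    try (ZooModel<Image, DetectedObjects> model = ModelZoo.loadModel(criteria);
         Predictor<Image, DetectedObjects> predictor = model.newPredictor()) {
        Image img = ImageFactory.getInstance().fromFile(Paths.get("test.jpg"));
        DetectedObjects detection = predictor.predict(img);
        System.out.println(detection);
    }
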
androuino commented 3 years ago

Hi @frankfliu, thanks for the instructions. I am now able to use the model I have trained. It looks like it's working, but I am getting this result in the console:

[INFO ] - [
    class: "pikachu", probability: 0.99466, bounds: [x=111.045, y=283.735, width=78.567, height=103.347]
    class: "pikachu", probability: 0.99373, bounds: [x=176.363, y=246.798, width=81.334, height=120.723]
    class: "pikachu", probability: 0.98868, bounds: [x=266.975, y=159.225, width=81.803, height=120.092]
    class: "pikachu", probability: 0.98227, bounds: [x=376.517, y=251.875, width=81.409, height=120.386]
    class: "pikachu", probability: 0.97852, bounds: [x=270.541, y=244.992, width=73.484, height=124.531]
]

but when I checked the image, expecting the detected bboxes to be drawn, there isn't a single bbox drawn on it. Could you please tell me why the output is not what I am expecting? Also, I have noticed that the values of x, y, width, and height are a little off compared to the car/dog/bike model.

The model I just trained for testing has only 1 class, pikachu, with a very small dataset to speed up training, and the backbone I used is ssd_512_vgg16_atrous_custom.

Thank you.

frankfliu commented 3 years ago

The bbox output in our car/dog/bike example is a fraction of the image size; all the values are between 0 and 1. I think your Translator didn't rescale the bbox. See: https://github.com/awslabs/djl/blob/master/api/src/main/java/ai/djl/modality/cv/translator/SingleShotDetectionTranslator.java#L61-L65
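
To illustrate why nothing shows up (a sketch of the coordinate mapping, not DJL's exact drawing code): the drawing side multiplies each fractional coordinate by the image dimensions, so a raw pixel value misread as a fraction lands far off-canvas:

    // Fractional coordinates map back to pixels by multiplying by the image size.
    int imageWidth = 512;                           // example image width (assumption)
    double rescaledX = 0.37;                        // a correctly rescaled coordinate
    int drawX = (int) (rescaledX * imageWidth);     // ~189 px, lands on the image

    double unscaledX = 111.045;                     // raw pixel output, misread as a fraction
    int offCanvas = (int) (unscaledX * imageWidth); // ~56,855 px, far outside the image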

androuino commented 3 years ago

Do you have any advice or leads on how to fix that? Thanks.

frankfliu commented 3 years ago

Can you share your training code?

androuino commented 3 years ago

@frankfliu sure. here is the train_ssd.py gist > https://gist.github.com/androuino/4d8b552bb1d0474c58c40fe63c4409fe. This is how I hybridize the trained model > https://gist.github.com/androuino/f62c6878e1c3482b97874f15b34300aa

Thanks again.

androuino commented 3 years ago

Hi @frankfliu, any leads or updates? Thank you very much.

frankfliu commented 3 years ago

@androuino I think you need to do the following:

Translator<Image, DetectedObjects> translator = SingleShotDetectionTranslator.builder()
            .optRescaleSize(300, 300)
            .addTransform(new Resize(300, 300))
            .addTransform(new ToTensor())
            .optSynsetUrl("https://mysynset.txt")
            .build();

The rescale uses the image size to map the bbox values into the 0-1 range.
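
As a quick sanity check (assuming the rescale divides each coordinate by the 300 x 300 size set above, which is what the linked translator source suggests):

    double rawX = 111.045;            // pixel-space value from the earlier output
    double fractionalX = rawX / 300;  // ~0.370, now inside the 0-1 range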

androuino commented 3 years ago

Thanks @frankfliu, I will test it now and get back to you with the result.

androuino commented 3 years ago

Hello again @frankfliu, it seems like it's working; however, when I test the trained model using Python I get this result: pikachu, while the detection in DJL is this: detected-pikachu

I did the changes as you suggested:

List<String> classes = new ArrayList<>();
classes.add("pikachu");
Translator<Image, DetectedObjects> translator = SingleShotDetectionTranslator.builder()
        .optRescaleSize(300, 300)
        .addTransform(new Resize(300, 300))
        .addTransform(new ToTensor())
        .optSynset(classes)
        .build();

 Criteria<Image, DetectedObjects> criteria = Criteria.builder()
        .optApplication(Application.CV.OBJECT_DETECTION)
        .setTypes(Image.class, DetectedObjects.class)
        .optModelUrls("file:///vgg16_atrous_custom.tar.gz")
        .optTranslator(translator)
        .build();

And this is what I get from the console:

[INFO ] - Detected objects image has been saved in: build/output/detected-pikachu.png
[INFO ] - [
    class: "pikachu", probability: 0.99218, bounds: [x=0.678, y=0.451, width=0.249, height=0.327]
    class: "pikachu", probability: 0.99102, bounds: [x=0.467, y=0.442, width=0.245, height=0.324]
    class: "pikachu", probability: 0.98329, bounds: [x=0.155, y=0.512, width=0.245, height=0.290]
    class: "pikachu", probability: 0.97551, bounds: [x=0.638, y=0.042, width=0.241, height=0.305]
    class: "pikachu", probability: 0.97120, bounds: [x=0.491, y=0.281, width=0.231, height=0.308]
]

I don't quite understand why there are a lot of bboxes drawn on the image while the console only shows 5 bboxes.

frankfliu commented 3 years ago

@androuino You are printing the DetectedObjects using its .toString() method, which only prints the top 5 by default. You actually have more items in the DetectedObjects.

For the pikachu model, you might want to set a higher threshold:

Translator<Image, DetectedObjects> translator = SingleShotDetectionTranslator.builder()
        .optRescaleSize(300, 300)
        ...
        .optThreshold(0.7f)
        .build();
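
To inspect every detection rather than the top-5 toString() summary, a minimal sketch (assuming your DetectedObjects result is named detection):

    // DetectedObjects.items() returns all detections; toString() caps at 5.
    java.util.List<DetectedObjects.DetectedObject> items = detection.items();
    for (DetectedObjects.DetectedObject obj : items) {
        System.out.println(obj.getClassName() + " " + obj.getProbability()
                + " " + obj.getBoundingBox());
    }
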
androuino commented 3 years ago

Nice! it's working perfectly now. Thank you so much @frankfliu.