Closed: androuino closed this issue 3 years ago
Translator<Image, DetectedObjects> translator = SingleShotDetectionTranslator.builder()
.addTransform(new ToTensor())
.optSynsetUrl("https://mysynset.txt")
.build();
Criteria<Image, DetectedObjects> criteria = Criteria.builder()
        .setTypes(Image.class, DetectedObjects.class)
        .optModelUrls("file:///Users/home/mxnet/vgg16_atrous_custom")
        .optTranslator(translator)
        .build();
3. SingleShotDetectionTranslator.Builder accepts synset input in three ways:
.optSynsetArtifactName(String fileName): // this expects an uncompressed file in the model directory
.optSynsetUrl(String url): // this can download synset.txt from any URL
.optSynset(List<String> classes): // you can pass the classes directly to the translator
4. optModelUrls() accepts both a local folder and an archive file (.zip/.tar). We don't expect the files in the model folder to be compressed; DJL handles compressed files in the following cases:
1. You created your own modelZoo and defined each file in metadata.json
2. You compressed your model directory into a .zip, .tar, or .tar.gz file; DJL will uncompress the whole folder into the cache directory automatically
5. When you specify optModelUrls() with only a single URL, you don't need to define other filters or an artifactId
6. We are working on an improvement to automatically create Translator, but this is still in progress.
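For the archive case in point 4, packaging a model directory can be sketched as follows. The file names below (symbol/params/synset) are placeholders for illustration; use your model's actual files.

```shell
# Create a model directory with the files DJL expects
# (names here are illustrative placeholders).
mkdir -p vgg16_atrous_custom
touch vgg16_atrous_custom/vgg16_atrous_custom-symbol.json \
      vgg16_atrous_custom/vgg16_atrous_custom-0000.params \
      vgg16_atrous_custom/synset.txt

# Package the whole directory; DJL uncompresses it into its cache
tar -czf vgg16_atrous_custom.tar.gz vgg16_atrous_custom

# Verify the archive contents
tar -tzf vgg16_atrous_custom.tar.gz
```

The resulting .tar.gz can then be passed to optModelUrls() as a file URL.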
Hi @frankfliu, thanks for the instructions. I am now able to use the model I have trained. It looks like it's working, but I am getting this result from the console:
[INFO ] - [
class: "pikachu", probability: 0.99466, bounds: [x=111.045, y=283.735, width=78.567, height=103.347]
class: "pikachu", probability: 0.99373, bounds: [x=176.363, y=246.798, width=81.334, height=120.723]
class: "pikachu", probability: 0.98868, bounds: [x=266.975, y=159.225, width=81.803, height=120.092]
class: "pikachu", probability: 0.98227, bounds: [x=376.517, y=251.875, width=81.409, height=120.386]
class: "pikachu", probability: 0.97852, bounds: [x=270.541, y=244.992, width=73.484, height=124.531]
]
but when I checked the image, expecting the detected bboxes to be drawn on it, there's not a single bbox drawn. Could you please tell me why the output is not what I am expecting? Also, I have noticed that the x, y, width, and height values are a little off compared to the car/dog/bike model.
The model I just trained for testing has only one class, pikachu, with a very small dataset to speed up training, and the backbone I used is ssd_512_vgg16_atrous_custom.
Thank you.
The bbox output in our car/dog/bike example is expressed as a fraction of the image size; all the values are between 0 and 1. I think your Translator didn't rescale the bbox. See: https://github.com/awslabs/djl/blob/master/api/src/main/java/ai/djl/modality/cv/translator/SingleShotDetectionTranslator.java#L61-L65
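For illustration, this is how fractional bbox values like those in the car/dog/bike example map back to pixel coordinates for drawing. The helper class and method names below are made up for this sketch; they are not part of the DJL API.

```java
public class BboxScaler {
    // Convert a fractional bbox (values in 0-1, as in the
    // car/dog/bike example) to pixel coordinates for a given
    // image size, so it can be drawn on the original image.
    static double[] toPixels(double x, double y, double w, double h,
                             int imgWidth, int imgHeight) {
        return new double[] {
            x * imgWidth, y * imgHeight, w * imgWidth, h * imgHeight
        };
    }

    public static void main(String[] args) {
        // A bbox covering the center quarter of a 512x512 image.
        double[] px = toPixels(0.25, 0.25, 0.5, 0.5, 512, 512);
        System.out.printf("x=%.1f y=%.1f w=%.1f h=%.1f%n",
                px[0], px[1], px[2], px[3]);
    }
}
```

If the translator emits raw pixel coordinates instead of fractions (as in the unscaled output above), drawing code that expects 0-1 values will place the boxes off-screen, which would explain the missing bboxes.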
Do you have any advice or leads on how to fix that? Thanks.
Can you share your training code?
@frankfliu sure. here is the train_ssd.py gist > https://gist.github.com/androuino/4d8b552bb1d0474c58c40fe63c4409fe. This is how I hybridize the trained model > https://gist.github.com/androuino/f62c6878e1c3482b97874f15b34300aa
Thanks again.
Hi @frankfliu, any leads or updates? Thank you very much.
@androuino I think you need to do the following:
Translator<Image, DetectedObjects> translator = SingleShotDetectionTranslator.builder()
.optRescaleSize(300, 300)
.addTransform(new Resize(300, 300))
.addTransform(new ToTensor())
.optSynsetUrl("https://mysynset.txt")
.build();
The rescale is based on the image size, mapping the bbox values to 0-1.
Thanks @frankfliu, I will test it now and get back to you with the result.
Hello again @frankfliu, it seems like it's working. However, when I test the trained model using Python I get this result: while the detection in DJL is this:
I did the changes as you suggested:
List<String> classes = new ArrayList<>();
classes.add("pikachu");
Translator<Image, DetectedObjects> translator = SingleShotDetectionTranslator.builder()
.optRescaleSize(300, 300)
.addTransform(new Resize(300, 300))
.addTransform(new ToTensor())
.optSynset(classes)
.build();
Criteria<Image, DetectedObjects> criteria = Criteria.builder()
.optApplication(Application.CV.OBJECT_DETECTION)
.setTypes(Image.class, DetectedObjects.class)
.optModelUrls("file:///vgg16_atrous_custom.tar.gz")
.optTranslator(translator)
.build();
And this is what I get from the console:
[INFO ] - Detected objects image has been saved in: build/output/detected-pikachu.png
[INFO ] - [
class: "pikachu", probability: 0.99218, bounds: [x=0.678, y=0.451, width=0.249, height=0.327]
class: "pikachu", probability: 0.99102, bounds: [x=0.467, y=0.442, width=0.245, height=0.324]
class: "pikachu", probability: 0.98329, bounds: [x=0.155, y=0.512, width=0.245, height=0.290]
class: "pikachu", probability: 0.97551, bounds: [x=0.638, y=0.042, width=0.241, height=0.305]
class: "pikachu", probability: 0.97120, bounds: [x=0.491, y=0.281, width=0.231, height=0.308]
]
I don't quite understand why there are so many bboxes drawn on the image while the console only shows 5 bboxes.
@androuino You are printing the DetectedObjects using its .toString() method, which only prints the top 5 by default. You actually have more items in the DetectedObjects.
For the pikachu model, you might want to set a higher threshold:
Translator<Image, DetectedObjects> translator = SingleShotDetectionTranslator.builder()
.optRescaleSize(300, 300)
...
.optThreshold(0.7)
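The effect of a threshold can be sketched independently of DJL: detections below the cutoff are dropped before drawing, which is why a low threshold produces many boxes on the image even though .toString() only shows the top 5. The Detection record and probabilities below are made up for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

public class ThresholdDemo {
    // Minimal stand-in for a detection result (illustrative only).
    record Detection(String className, double probability) {}

    // Keep only detections at or above the threshold, mimicking
    // the effect of optThreshold(0.7) on the final result set.
    static List<Detection> filter(List<Detection> all, double threshold) {
        return all.stream()
                .filter(d -> d.probability() >= threshold)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Detection> all = List.of(
                new Detection("pikachu", 0.99),
                new Detection("pikachu", 0.82),
                new Detection("pikachu", 0.41),   // dropped at 0.7
                new Detection("pikachu", 0.15));  // dropped at 0.7
        System.out.println(filter(all, 0.7).size()); // prints 2
    }
}
```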
Nice! it's working perfectly now. Thank you so much @frankfliu.
Question
Hi, I would like to ask how I can load a trained, hybridized Gluon model (the generated .params and .json files) in DJL. Should I also create a synset.txt file for the list of classes and then compress everything as a tar.gz file? I'd like to know the correct way to do it. From what I have tried, I compressed these 3 files as a tar.gz file, then from DJL I tried this code to load the model:
but I am getting this error instead:
Thanks in advance for your help.