testdotai / appium-classifier-plugin

Apache License 2.0
Performance and flexibility improvement by Single Shot MultiBox Detector #3

Open NozomiIto opened 5 years ago

NozomiIto commented 5 years ago

As I understand it, the current implementation uses XPath to retrieve all elements matching "//*[not(child::*)]" (which I take to mean all elements without children) and then classifies each element's image in parallel. This approach is simple and stable, but I think we can improve on it by using SSD (Single Shot MultiBox Detector). SSD detects multiple object classes and their locations in an image in a single pass (like this).
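To make the XPath reading above concrete, here is a minimal Python sketch of what "elements without children" selects. It is an illustration only, not the plugin's code: the page source below is a made-up hierarchy, and `leaf_elements` is a hypothetical helper. Python's `xml.etree.ElementTree` does not support the `not(child::*)` predicate, so the sketch expresses the same condition by checking each element's child count.

```python
import xml.etree.ElementTree as ET

# Hypothetical page source shaped like an Appium XML hierarchy (illustration only).
PAGE_SOURCE = """
<hierarchy>
  <FrameLayout>
    <LinearLayout>
      <ImageButton content-desc="cart"/>
      <TextView text="Checkout"/>
    </LinearLayout>
    <ImageView content-desc="logo"/>
  </FrameLayout>
</hierarchy>
"""


def leaf_elements(xml_source):
    """Return every element that has no child elements, in document order.

    This mirrors the condition in //*[not(child::*)]: len(el) is the number
    of direct child elements, so len(el) == 0 means "no children".
    """
    root = ET.fromstring(xml_source)
    return [el for el in root.iter() if len(el) == 0]


leaves = leaf_elements(PAGE_SOURCE)
leaf_tags = [el.tag for el in leaves]
print(leaf_tags)
```

Only the three leaf widgets are selected; the container elements are skipped because each of them has at least one child. Every selected element would then be cropped from the screenshot and classified separately.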

I'm afraid the current implementation cannot recognize an object that has no corresponding Appium element. An SSD model, on the other hand, can detect any object in an image, even one with no corresponding Appium element. This is quite useful for Unity apps, WebViews without permission, and other cases that Appium often cannot handle well.

I found that TensorFlow.js has a sample SSD model, and I think it could be used for this AI locator.

The problems which we should overcome are:

I think this is the next step for this locator to become a true AI locator: one that behaves exactly like a human tester and, like a human tester, requires no system information or permissions.

I'd like to know what you think about this improvement.

jlipps commented 5 years ago

Hi @NozomiIto, these are really interesting thoughts, and I think your assessment of the benefits and drawbacks of the SSD approach is accurate. One possibility is to return ImageElements instead of full-blown Appium elements; all we need for them is the screenshot bounds of the element.
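As a rough sketch of the "screenshot bounds" idea, the Python below converts detector output into pixel rectangles. Everything here is an assumption for illustration: the detection dictionaries, the `Bounds` type, and `detections_to_bounds` are hypothetical, and the box format (normalized `[ymin, xmin, ymax, xmax]`) is assumed rather than taken from any particular SSD model. It does not use or represent Appium's actual ImageElement API.

```python
from dataclasses import dataclass


@dataclass
class Bounds:
    """Pixel rectangle on the screenshot (hypothetical type for this sketch)."""
    label: str
    score: float
    x: int
    y: int
    width: int
    height: int


def detections_to_bounds(detections, screen_w, screen_h, min_score=0.5):
    """Keep confident detections and scale their boxes to screenshot pixels.

    Assumes each detection is a dict with "label", "score", and a "box"
    given as normalized (ymin, xmin, ymax, xmax) values in the range 0..1.
    """
    results = []
    for det in detections:
        if det["score"] < min_score:
            continue  # drop low-confidence detections
        ymin, xmin, ymax, xmax = det["box"]
        x = round(xmin * screen_w)
        y = round(ymin * screen_h)
        width = round(xmax * screen_w) - x
        height = round(ymax * screen_h) - y
        results.append(Bounds(det["label"], det["score"], x, y, width, height))
    return results


# Made-up detections for a 1080x1920 screenshot.
detections = [
    {"label": "cart", "score": 0.91, "box": (0.10, 0.80, 0.20, 0.95)},
    {"label": "menu", "score": 0.30, "box": (0.00, 0.00, 0.10, 0.10)},
]
bounds = detections_to_bounds(detections, screen_w=1080, screen_h=1920)
print(bounds)
```

Because only a rectangle is needed, a detected object never has to be matched to an entry in the element hierarchy, which is what would make this workable for Unity screens or restricted WebViews.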

I haven't used the SSD approach before myself so I'm not sure how to go about prototyping it, but I'd be happy to assist if you want to help with a proposal @NozomiIto!

NozomiIto commented 5 years ago

Thanks @jlipps !

> ImageElements

Oh, sounds good! I didn't know about that element! It is used for the image locator, isn't it?

> I'd be happy to assist if you want to help with a proposal @NozomiIto!

Yeah, sure. I'm familiar with SSD in Python (TensorFlow or Caffe), but I have never used TensorFlow.js or its SSD model. I will take a look at the TensorFlow.js SSD model and check how it can be integrated into Appium.