SubhasisDutta / Wireframe-Identification-Engine

This app is a concept demonstrator that automatically converts a wireframe design image into user interface code. It identifies the different user interface controls using various machine learning techniques and converts them into usable metadata, which can then be consumed by a platform like SAP's BUILD to generate a freestyle prototype. Using this app, a user can build training models by collecting data and continually improve the classifier through reinforcement learning.
http://35.160.238.107:6060/
Apache License 2.0

Create a post for build team #7

Closed SubhasisDutta closed 8 years ago

SubhasisDutta commented 8 years ago

Below is a summary of the discussion that Arturo, Gabby, Wibin, Chats, and I had during a few meetings on this topic.

Similar solution available

A recently published patent on a Wireframe Recognition and Analytics Engine (WRAE) (https://www.google.com/patents/US20140068553) attempts to address this problem.

Demo : https://www.youtube.com/watch?v=kXGTPrFUbUg

In brief, the suggested solution has three steps:

  1. Input acceptance and feature identification. The provided image is run through various computer vision processes, such as the Canny edge detector and optical character recognition, using open source packages like OpenCV (http://www.willowgarage.com/pages/software/opencv) and OCRopus (https://github.com/tmbdev/ocropy). Information about the position, size, and enclosed text of each element is extracted.
  2. Identification of wireframe components. Predefined conditional decision rules are traversed, and the rule that a particular component satisfies determines its identification. This approach is rigid and will not scale: because components are identified through predefined conditions per element type, the application only works for a small, specific set of wireframes that comply with the decision rules defined in the recognition engine, and therefore supports only a few controls. Decision-based identification also requires precise drawing.
  3. Generation of source code from the identified tags.
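The rigidity criticised in step 2 is easiest to see in code. Below is a minimal sketch of what such a rule-based recognizer amounts to; the thresholds and component names are hypothetical, chosen only to illustrate how quickly a fixed rule set falls through to "unknown":

```python
# Hypothetical decision rules in the style of the patent's recognition
# engine. Each extracted region is described by its bounding box and any
# enclosed text; the first rule that matches wins.

def classify_region(width, height, text):
    """Classify a wireframe region with fixed, hand-written rules.

    These thresholds are illustrative only: any drawing that deviates
    slightly (e.g. a square button) falls through to 'unknown', which is
    exactly the rigidity criticised above.
    """
    aspect = width / height
    if text and aspect > 5:          # long thin box with text
        return "input_field"
    if text and 2 <= aspect <= 5:    # short wide box with text
        return "button"
    if not text and aspect > 8:      # long thin line, no text
        return "separator"
    return "unknown"                 # rule set exhausted

print(classify_region(200, 40, "Submit"))   # aspect 5.0  -> button
print(classify_region(300, 30, "Name"))     # aspect 10.0 -> input_field
print(classify_region(60, 60, "OK"))        # aspect 1.0  -> unknown
```

A learned classifier (as proposed below) replaces these hand-written branches with a model trained from examples, so unusual but valid drawings are not simply rejected.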

Possible Approach discussed

figure1

Workflow:

  1. The user uploads a set of wireframes into a prototype project in BUILD.
  2. For each wireframe uploaded, an image page is created and the asset is identified by (Project Id, Page Id, Asset Id).
  3. After the image is saved in the database, a background process is initiated, passing the asset identifiers to the wireframe analysis engine.
  4. The entire image is analyzed (using the same ML technique we will use for identifying controls in step 8) to identify the suitable device layout (Desktop, Tablet, Phone). A corresponding gray-scale image with noise reduction at a fixed resolution (1280 x 1024) is then generated. This gray-scale image is provided as input to the next three steps, which are processed in parallel.
  5. The image is analyzed with an optical character recognition system to identify the text and its position in the image. Two open source solutions were identified that can be modified to produce the desired output: OCRopus (https://github.com/tmbdev/ocropy) and Tesseract (https://github.com/tesseract-ocr/tesseract).
  6. The image is analyzed to break the wireframe into small segments, each containing a user component. Two image processing methods were identified: Canny edge detection with contour identification (http://docs.opencv.org/trunk/da/d22/tutorial_py_canny.html, http://docs.opencv.org/trunk/d4/d73/tutorial_py_contours_begin.html) and the watershed method (http://opencv24-python-tutorials.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_watershed/py_watershed.html). Another possibility, as Divesh suggested, is to ask the user to manually mark the different objects in the wireframe through a hot-spot-like feature.
  7. As the image is broken down, the different properties associated with each component, such as its position in the page, width, and height, are deduced and passed to the metadata generator.
  8. After the components are separated, they are passed through an identifier to determine the type of each component. Two open source projects used for image recognition with machine learning techniques such as neural networks were identified: a. TensorFlow - https://github.com/tensorflow/tensorflow (https://www.tensorflow.org/) b. Caffe - https://github.com/BVLC/caffe (http://caffe.berkeleyvision.org/)
  9. The metadata generator receives all the data and combines it to create a BSON object in the pageMetadata format, which is persisted in MongoDB.
  10. After the metadata is available in the database, the user gets the option to add the generated page from the meatball menu in the BUILD prototype editor; using the create-page flow, we can then add a new object page into the project.
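Step 4's preprocessing (gray-scale conversion at a fixed resolution) can be sketched without OpenCV. The luminance weights below are the standard ITU-R BT.601 coefficients, and nearest-neighbour resampling stands in for whatever resizing and noise reduction the real engine would use:

```python
import numpy as np

def to_grayscale(rgb):
    """Convert an (H, W, 3) RGB array to an (H, W) gray-scale array
    using the ITU-R BT.601 luminance weights."""
    return rgb @ np.array([0.299, 0.587, 0.114])

def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize of a 2-D image to a fixed resolution
    (the workflow above targets 1280 x 1024)."""
    h, w = img.shape
    rows = np.arange(out_h) * h // out_h  # source row for each output row
    cols = np.arange(out_w) * w // out_w  # source column for each output column
    return img[rows][:, cols]

# A random stand-in for an uploaded wireframe image
wireframe = np.random.randint(0, 256, size=(48, 64, 3)).astype(float)
gray = to_grayscale(wireframe)
fixed = resize_nearest(gray, 1024, 1280)
print(gray.shape, fixed.shape)   # (48, 64) (1024, 1280)
```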
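Steps 6 and 7 amount to segmenting the binarised image into connected regions and reading off each region's position and size. A minimal flood-fill version, as a pure-Python stand-in for the Canny/contour or watershed pipelines named above, looks like this:

```python
from collections import deque

def segment_components(grid):
    """Label 4-connected regions of 1-pixels in a binary grid and return
    each region's bounding box as (top, left, height, width) -- the
    position/size properties step 7 passes to the metadata generator."""
    h, w = len(grid), len(grid[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for r in range(h):
        for c in range(w):
            if grid[r][c] and not seen[r][c]:
                # BFS flood fill over this component
                top, left, bottom, right = r, c, r, c
                queue = deque([(r, c)])
                seen[r][c] = True
                while queue:
                    y, x = queue.popleft()
                    top, bottom = min(top, y), max(bottom, y)
                    left, right = min(left, x), max(right, x)
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and grid[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                boxes.append((top, left, bottom - top + 1, right - left + 1))
    return boxes

# Three separate rectangles in a toy 5 x 8 binary wireframe
grid = [
    [1, 1, 0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0],
]
print(segment_components(grid))
# [(0, 0, 2, 2), (0, 5, 2, 3), (3, 1, 2, 3)]
```

Each bounding box then crops a sub-image that is fed to the step 8 classifier.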
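Step 9's metadata generator just merges the parallel outputs into one document. The `pageMetadata` field names below are hypothetical placeholders, since the real BUILD schema is not described here, and the object would be stored as BSON via a MongoDB driver rather than printed as JSON:

```python
import json

def build_page_metadata(project_id, page_id, asset_id, layout, components, texts):
    """Combine the layout detector, segmenter/classifier, and OCR outputs
    into a single pageMetadata-style document (field names are illustrative)."""
    return {
        "projectId": project_id,
        "pageId": page_id,
        "assetId": asset_id,
        "deviceLayout": layout,      # Desktop / Tablet / Phone, from step 4
        "controls": [                # classified components, from steps 6-8
            {"type": ctype, "x": x, "y": y, "width": w, "height": h}
            for (ctype, x, y, w, h) in components
        ],
        "texts": texts,              # OCR results with positions, from step 5
    }

doc = build_page_metadata(
    "P1", "pg1", "a1", "Desktop",
    components=[("button", 10, 20, 200, 40)],
    texts=[{"value": "Submit", "x": 15, "y": 28}],
)
print(json.dumps(doc, indent=2))
```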

In my opinion, it would be better to start processing every image page as soon as it is uploaded, and only enable the option to generate a hi-fidelity page once we have been able to generate metadata for that image page. The entire process is going to be time-consuming (probably more than a minute), and it would be a bad experience to make the user wait that long only to come up with nothing.
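This upload-time processing with a gated UI option can be sketched with a background worker and a per-asset status flag; the function and field names here are hypothetical, and a real deployment would use a proper job queue and MongoDB rather than an in-memory dict:

```python
import threading
import time

# Per-asset status store; in the real system this would live in MongoDB
# alongside the generated pageMetadata.
metadata_store = {}

def analyze_wireframe(asset_id):
    """Stand-in for the full (slow, possibly >1 minute) analysis pipeline."""
    time.sleep(0.1)  # simulate the expensive OCR/segmentation/classification work
    metadata_store[asset_id] = {"status": "ready", "controls": []}

def on_upload(asset_id):
    """Kick off analysis immediately at upload time, without blocking the user."""
    metadata_store[asset_id] = {"status": "processing"}
    worker = threading.Thread(target=analyze_wireframe, args=(asset_id,))
    worker.start()
    return worker

def can_generate_hifi_page(asset_id):
    """The editor only enables 'generate hi-fidelity page' once metadata exists."""
    return metadata_store.get(asset_id, {}).get("status") == "ready"

worker = on_upload("asset-42")
print(can_generate_hifi_page("asset-42"))  # False: analysis still running
worker.join()
print(can_generate_hifi_page("asset-42"))  # True: metadata is now available
```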