leo-p / papers

Papers and their summary (in issue)

pix2code: Generating Code from a Graphical User Interface Screenshot #36

Open leo-p opened 7 years ago

leo-p commented 7 years ago

https://arxiv.org/pdf/1705.07962.pdf

Transforming a graphical user interface screenshot created by a designer into computer code is a typical task conducted by a developer in order to build customized software, websites and mobile applications. In this paper, we show that Deep Learning techniques can be leveraged to automatically generate code given a graphical user interface screenshot as input. Our model is able to generate code targeting three different platforms (i.e. iOS, Android and web-based technologies) from a single input image with over 77% of accuracy.

leo-p commented 7 years ago

Summary:

Inner-workings:

They decompose the problem into three sub-problems (a code sketch mapping them onto the model follows this list):

  1. a computer vision problem of understanding the given scene and inferring the objects present, their identities, positions, and poses.
  2. a language modeling problem of understanding computer code and generating syntactically and semantically correct samples.
  3. use the solutions to both previous sub-problems by exploiting the latent variables inferred from scene understanding to generate corresponding textual descriptions of the objects represented by these variables.
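
To make step 3 concrete, here is a minimal Keras-style sketch of how the two encoders and the decoder fit together. The layer sizes (32/64/128 conv filters, two LSTM-128 encoder layers, two LSTM-512 decoder layers, a 48-token context window) follow my reading of the paper and should be checked against it; this is a sketch, not the authors' released code.

```python
# Minimal sketch of the pix2code setup: a CNN encodes the GUI screenshot (step 1),
# an LSTM stack encodes the previous DSL tokens (step 2), and a decoder LSTM combines
# both to predict the next DSL token (step 3). Sizes are assumptions from the paper.
from tensorflow.keras import layers, models

VOCAB_SIZE = 19      # number of DSL tokens (depends on the target platform)
CONTEXT_LEN = 48     # length of the sliding token context window
IMG_SHAPE = (256, 256, 3)

# 1. Vision model: understand the screenshot and compress it into a feature vector.
img_in = layers.Input(shape=IMG_SHAPE)
x = img_in
for filters in (32, 64, 128):
    x = layers.Conv2D(filters, 3, padding="valid", activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="valid", activation="relu")(x)
    x = layers.MaxPooling2D(2)(x)
    x = layers.Dropout(0.25)(x)
x = layers.Flatten()(x)
x = layers.Dense(1024, activation="relu")(x)
x = layers.Dropout(0.3)(x)
img_feat = layers.RepeatVector(CONTEXT_LEN)(x)   # repeat so it can be merged per time step

# 2. Language model: encode the one-hot DSL tokens seen so far.
ctx_in = layers.Input(shape=(CONTEXT_LEN, VOCAB_SIZE))
h = layers.LSTM(128, return_sequences=True)(ctx_in)
h = layers.LSTM(128, return_sequences=True)(h)

# 3. Decoder: exploit the visual latent variables together with the text encoding
#    to predict the next DSL token.
merged = layers.concatenate([img_feat, h])
d = layers.LSTM(512, return_sequences=True)(merged)
d = layers.LSTM(512, return_sequences=False)(d)
out = layers.Dense(VOCAB_SIZE, activation="softmax")(d)

model = models.Model(inputs=[img_in, ctx_in], outputs=out)
model.compile(loss="categorical_crossentropy", optimizer="rmsprop")
```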

They also introduce a Domain-Specific Language (DSL) to describe GUIs: the model generates DSL tokens rather than raw platform code, which is then compiled to the target platform. An illustrative snippet follows.
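
For intuition, here is an illustrative snippet in the spirit of the web-target DSL with a trivial tokenizer. The container/element token names below are approximations; the exact grammar is defined in the paper.

```python
# Illustrative only: a tiny GUI description in the spirit of the paper's web DSL.
# Token names (header, row, btn-active, ...) are assumptions, not the exact grammar.
import re

dsl_sample = """
header { btn-active, btn-inactive }
row {
  single { small-title, text, btn-green }
}
"""

# The vocabulary is tiny, so tokenizing is just splitting into braces, commas and words.
tokens = re.findall(r"[{},]|[\w-]+", dsl_sample)
print(tokens)   # ['header', '{', 'btn-active', ',', 'btn-inactive', '}', 'row', ...]
```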

Architecture:

(screenshot of the paper's architecture figure)
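
At inference time the model is unrolled one DSL token at a time. Below is a hedged sketch of greedy decoding (my reading of the sampling procedure, not the authors' released code), reusing the model and constants from the sketch above; the start/end marker names are assumptions.

```python
# Greedy decoding sketch: feed the screenshot plus a sliding window of the tokens
# generated so far, take the argmax token, and stop at the end marker.
import numpy as np

CONTEXT_LEN, VOCAB_SIZE = 48, 19   # same assumed constants as in the model sketch above

def generate_dsl(model, image, token_to_id, id_to_token, max_len=150):
    context = ["<START>"]                       # start marker name is an assumption
    for _ in range(max_len):
        window = context[-CONTEXT_LEN:]
        # One-hot encode the window, left-padded with zeros up to CONTEXT_LEN.
        ctx = np.zeros((1, CONTEXT_LEN, VOCAB_SIZE))
        for i, tok in enumerate(window):
            ctx[0, CONTEXT_LEN - len(window) + i, token_to_id[tok]] = 1.0
        probs = model.predict([image[np.newaxis], ctx], verbose=0)[0]
        next_tok = id_to_token[int(np.argmax(probs))]
        if next_tok == "<END>":                 # end marker name is an assumption
            break
        context.append(next_tok)
    return " ".join(context[1:])                # drop the start marker
```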

Results:

Clearly not ready for any serious use but promising results!

(screenshot of the paper's results table)