Ameobea / orange3

Orange 3 data mining suite: http://orange.biolab.si

Daily Progress Updates #4

Closed ameo-unito-bot closed 8 years ago

ameo-unito-bot commented 8 years ago

┆Issue is synchronized with this Asana task

Ameobea commented 8 years ago

Good things have been happening so far.

I've decided to convert the code generator into a much more advanced tool than I had previously envisioned. Instead of taking three blocks of code as input, converting them into strings, and shoving the output into a file, it now allows for fine-tuned manipulation of the output.

All widgets are now represented by classes in the output code. These classes are basically trimmed down versions of the widget code that only contain what functionality is necessary to take in input from channels and produce output.
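To make the idea concrete, here is a minimal sketch of what such a trimmed-down class could look like. It is not the actual generated output (that is linked further down); the names `GeneratedWidget`, `receive`, `run`, and `Upper` are made up for illustration, and the only assumption carried over from the description is that a widget takes input from channels and produces output through `send()`.

```python
class GeneratedWidget:
    """Hypothetical trimmed-down stand-in for a widget: no GUI, only input and output."""

    def __init__(self):
        self.inputs = {}   # channel name -> last value received
        self.outputs = {}  # channel name -> last value sent

    def receive(self, channel, value):
        # Store a value arriving on a named input channel.
        self.inputs[channel] = value

    def send(self, channel, value):
        # Capture the output in a dict instead of signalling a GUI canvas.
        self.outputs[channel] = value

    def run(self):
        # Core computation; this base version just forwards its input.
        self.send("Data", self.inputs.get("Data"))


class Upper(GeneratedWidget):
    """Hypothetical second widget that transforms whatever it receives."""

    def run(self):
        self.send("Data", [s.upper() for s in self.inputs["Data"]])


# Wire two widgets together by hand, the way an exported script might.
source = GeneratedWidget()
source.receive("Data", ["iris", "titanic"])
source.run()

sink = Upper()
sink.receive("Data", source.outputs["Data"])
sink.run()
```

After this runs, `sink.outputs["Data"]` holds the transformed list, with no GUI code involved at any point.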

The code generator now has a much more expansive list of data types and places where it can put that data.

  1. Preamble: Still goes at the beginning of the script before any widget code. It takes in external dependency objects and generates an import statement for them using the __package__ and __module__ attributes. I plan on adding functionality for manual imports for cases where automatic import generation doesn't work.
  2. Attribute Declarations: These are functions and variables that are defined as attributes of the generated class. For the most part, they are simply attributes of the widget class copied over into the output. I made it easy to copy attributes between the widget class and the generated class by simply adding their names to a list provided to the attribute generator function.
  3. __init__ Declarations: These are, as their name suggests, declarations that go inside the generated class's __init__. They are for stuff like setting up internal data structures required for imported functions.
  4. External Functions/Variables: These are functions and variables that are declared in the widget file outside of the widget class. They are inserted in the same area of the script as the output class, but before it.
  5. Body: This is the core function that is called when output is requested from the widget. It is integrated with an overridden send() function to enable widget output to be captured and sent off to other widgets in the script.
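The five sections above can be sketched roughly as follows. This is a simplified illustration, not the real codegen module: the class name `SectionedGenerator` and the methods `add_import`, `add_attribute`, `add_external`, `add_body`, and `render` are hypothetical, and `add_init` only borrows its name from the real generator, not its signature. The import line is built from `__module__` and `__name__`, which is one plausible way to do the automatic import generation.

```python
from collections import OrderedDict


class SectionedGenerator:
    """Hypothetical generator that collects code per section and renders a script."""

    def __init__(self, class_name):
        self.class_name = class_name
        self.imports = []     # 1. preamble
        self.attributes = []  # 2. attribute declarations
        self.init_lines = []  # 3. __init__ declarations
        self.externals = []   # 4. external functions/variables
        self.body_lines = []  # 5. body

    def add_import(self, obj):
        # Derive the import statement from the object's own metadata.
        line = "from {} import {}".format(obj.__module__, obj.__name__)
        if line not in self.imports:
            self.imports.append(line)

    def add_attribute(self, name, value):
        self.attributes.append("{} = {!r}".format(name, value))

    def add_init(self, name, expression):
        # `expression` is a string of code, emitted as-is.
        self.init_lines.append("self.{} = {}".format(name, expression))

    def add_external(self, source):
        self.externals.append(source.strip())

    def add_body(self, line):
        self.body_lines.append(line)

    def render(self):
        pad = " " * 4
        lines = list(self.imports) + [""]
        for source in self.externals:
            lines += [source, ""]  # externals go before the class
        lines.append("class {}:".format(self.class_name))
        lines += [pad + a for a in self.attributes]
        lines.append(pad + "def __init__(self):")
        lines += [pad * 2 + i for i in self.init_lines] or [pad * 2 + "pass"]
        lines.append(pad + "def run(self):")
        lines += [pad * 2 + b for b in self.body_lines] or [pad * 2 + "pass"]
        return "\n".join(lines) + "\n"


gen = SectionedGenerator("FileWidget")
gen.add_import(OrderedDict)
gen.add_attribute("name", "File")
gen.add_init("cache", "OrderedDict()")
gen.add_external("def double(x):\n    return 2 * x")
gen.add_body("return double(21)")
script = gen.render()
```

Keeping each section in its own list is what allows the output to be adjusted piece by piece instead of being assembled from three fixed blocks of text.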

I also added limited raw code editing capabilities through null_ln(), which simply removes any lines in the output code that reference a certain variable. This is designed to trim down the exported code by removing things like GUI-update code that sits inside a core widget function.
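A line filter of this kind might look like the sketch below. The function name `null_ln_sketch` and the whole-word matching rule are assumptions; the real null_ln() may match differently.

```python
import re


def null_ln_sketch(code, variable):
    """Drop every line of `code` that mentions `variable` as a whole word."""
    pattern = re.compile(r"\b{}\b".format(re.escape(variable)))
    kept = [line for line in code.splitlines() if not pattern.search(line)]
    return "\n".join(kept)


sample = "\n".join([
    "data = load(path)",
    "info_label.setText('loaded')",  # GUI update, not needed in a script
    "send('Data', data)",
])
cleaned = null_ln_sketch(sample, "info_label")
```

One caveat with this approach: if the removed line was the only statement inside an `if` or `for` block, the remaining code is no longer valid Python, so it only suits lines that can be dropped without leaving an empty block behind.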

I've got basic script exporting partially working for the owfile widget, although it is still mostly broken, with some parts entirely non-functional. I plan on getting this widget 100% functional and then moving on to other widgets, extending the code generator class as needed to meet the needs of other widgets.

Ameobea commented 8 years ago

I've finished hammering out the majority of the code generator's main sub-generator functions. Currently implemented are attributes, init, and external functions. I've decided that the generator is in good enough shape to provide an example of generated code as it currently exists:

https://ameo.link/u/bin/2in

and here is the code generation function from owfile.py:

https://ameo.link/u/bin/2io

This is the output from running the code generator on an Orange project consisting of 6 widgets, two of which are owfile. As can be seen in the generated output, widget classes are created along with __init__ functions that contain the necessary declarations to enable the widget to perform its operations.

The code in the actual widget file totals 40 lines, of which 16 are comments or whitespace; all other generation code is contained in the codegen module in utils.

There is one problem, however. It's impossible to insert arbitrarily selected code from the class that isn't isolated in a function. For example, these lines:

```python
gen.add_init("recent_paths", "Setting([\n" +
            indent(3, "RecentPath(\"\", \"sample-datasets\", \"iris.tab\"),\n") +
            indent(3, "RecentPath(\"\", \"sample-datasets\", \"titanic.tab\"),\n") +
            indent(3, "RecentPath(\"\", \"sample-datasets\", \"housing.tab\"),\n") +
            indent(2, "])"), iscode=True)
```

had to be added to the widget's code generator init function to give the widget the value of self.recent_paths, a Setting object. I have code that automatically converts easily stringifiable data types (int, float, tuple) into declarations, but for data of other types that isn't possible. Possible solutions include writing search functions that, for example, copy the first line containing a given string into the output code, or that insert lines by line number. However, that is a messy approach that makes forward-compatibility with changes to the widget code difficult.
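The split between values that can be stringified automatically and values that need hand-written code can be sketched like this. The helper names `is_literal` and `declaration` are hypothetical, and the round-trip check through `ast.literal_eval` is a stand-in for the type check on int, float, and tuple, not the actual implementation.

```python
import ast


def is_literal(value):
    """True if repr(value) is a Python literal that evaluates back to value."""
    try:
        return ast.literal_eval(repr(value)) == value
    except (ValueError, SyntaxError):
        return False


def declaration(name, value, code=None):
    """Build a `self.<name> = ...` line for a generated __init__."""
    if code is not None:
        # Hand-written code, similar in spirit to iscode=True above.
        return "self.{} = {}".format(name, code)
    if is_literal(value):
        return "self.{} = {!r}".format(name, value)
    raise TypeError("cannot stringify {!r}; pass code= explicitly".format(name))


class Opaque:
    """Stand-in for an object with no literal representation."""


simple = declaration("limit", 5)
nested = declaration("shape", (1, 2.5))
manual = declaration("recent_paths", None, code="Setting([])")
```

Calling `declaration("paths", Opaque())` raises `TypeError`, which is exactly the case where the declaration has to be written by hand.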

Another solution I have for this is to create a function that performs all initializations and insert THAT into the generated code. The issue is that the initialization code has to be copy-pasted from the widget into that function in order to be processed, so it ends up duplicated.

The final solution could be to change the actual widget code and move the necessary declarations into function(s) that make it easier for the code generator to process them. I've refrained so far from touching any of the native widget code, keeping my edits isolated to new functions only, but this may be the best way to deal with the problem.
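If the declarations did live in their own method, pulling them into the output becomes a lookup by name. The sketch below shows one way to do that with the standard `ast` module (Python 3.8+ for `ast.get_source_segment`). The widget source, the class `FileWidget`, and the method names `init_paths` and `setup_gui` are invented for the example and are not taken from owfile.py.

```python
import ast
import textwrap

# Hypothetical widget source with initialization isolated in its own method.
WIDGET_SOURCE = '''
class FileWidget:
    def __init__(self):
        self.init_paths()
        self.setup_gui()

    def init_paths(self):
        self.recent_paths = ["iris.tab", "titanic.tab"]

    def setup_gui(self):
        self.label = "not needed in the exported script"
'''


def extract_method(source, class_name, method_name):
    """Return the dedented source text of one method of one class."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            for item in node.body:
                if isinstance(item, ast.FunctionDef) and item.name == method_name:
                    # padded=True keeps the first line's indentation so dedent works.
                    segment = ast.get_source_segment(source, item, padded=True)
                    return textwrap.dedent(segment)
    raise LookupError("{}.{} not found".format(class_name, method_name))


snippet = extract_method(WIDGET_SOURCE, "FileWidget", "init_paths")
```

Because the method is found by name rather than by line number or string search, the extraction keeps working when unrelated parts of the widget file change.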