Ameobea / orange3

Orange 3 data mining suite: http://orange.biolab.si

Daily Progress Updates #23

Closed ameo-unito-bot closed 8 years ago

Ameobea commented 8 years ago

I discovered that the discrepancy between the script's results and native Orange's results in owoutliers was caused by a mixup with the preprocessors attribute. I've fixed this by requiring all learners to include their preprocessors in the generated code every time, even if they are the defaults.

@Pelonza I created a spreadsheet to help me keep track of what still needs to be done in terms of code generator creation: https://docs.google.com/spreadsheets/d/1YpGQtxAKw4Am6vmRgAefz9Mwknrl7ITVLLurJG2yfWo/edit?usp=sharing

I plan on making most of the Visualize widgets using the widget instance method since creating code to generate matplotlib or other external plotting library plots would be a large project in itself. For Classify and Regression, most of the work is done through the __repr__ code which has already been implemented.

For Data, Unsupervised, and Evaluate, the work will be more manual but should progress at a reasonable rate thanks to the good core code generation engine that is currently implemented. I will continue posting progress updates for my work here and will make any changes you ask as work continues. I still aim to have this mostly if not totally finished by the beginning of the school year.

Ameobea commented 8 years ago

I created code generators for owimpute, oweditdomain, and owdatasampler. owdatasampler required me to convert some sampler functions into classes with __init__ and __call__ methods. This allowed me to add __repr__ methods to them as well, which made the generated code about five times smaller and easier to understand.
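A minimal sketch of that function-to-class conversion, for illustration only. `RandomSampler` and its parameters are hypothetical names, not the actual Orange sampler code; the point is that once the settings live on the instance, `__repr__` can emit a single constructor call for the code generator to paste in.

```python
import random


class RandomSampler:
    """Hypothetical sampler: a plain function turned into a callable class."""

    def __init__(self, n=10, replace=False, seed=None):
        # Settings are stored on the instance so __repr__ can reproduce them.
        self.n = n
        self.replace = replace
        self.seed = seed

    def __call__(self, rows):
        # A fresh generator per call keeps results repeatable for a given seed.
        rng = random.Random(self.seed)
        indices = range(len(rows))
        if self.replace:
            picked = [rng.choice(indices) for _ in range(self.n)]
        else:
            picked = rng.sample(indices, min(self.n, len(rows)))
        return [rows[i] for i in picked]

    def __repr__(self):
        # The repr is valid Python that rebuilds an equivalent sampler.
        return "{}(n={!r}, replace={!r}, seed={!r})".format(
            type(self).__name__, self.n, self.replace, self.seed)


sampler = RandomSampler(n=3, seed=42)
print(repr(sampler))  # RandomSampler(n=3, replace=False, seed=42)
```

With a repr like this, generated code can be one line (`sampler = RandomSampler(n=3, replace=False, seed=42)`) instead of an inlined copy of the sampling logic.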

I've cherry-picked the repr changes into the repr branch. @kernc what would be the best time to aim to create a PR for the repr changes? I continue to add to it as I develop the code generator and individual code generator functions for widgets. However, I could submit future __repr__-related changes directly to the code-generator branch and include them in the main code generator PR once my work is finished.

kernc commented 8 years ago

> to help me keep track of what still needs to be done

Add unit tests on your todo list.

> what would be the best time to aim to create a PR for the repr changes?

ASAP. If you continue to work on it, tag it [WIP]. The way you propose could also work. Either case, we get to see your changes early.

Ameobea commented 8 years ago

I honestly have no idea how to even approach writing tests for this. I've never written official unit tests for anything in my life, having worked almost exclusively on projects where I was the only developer. I'm not asking you to teach me, but do you have any links to guides or references for Orange's test framework, or some standard for how to do it?

I'll submit the repr PR now and include any other changes I make related to it directly in the code generation PR, whenever that happens.

kernc commented 8 years ago

This looks like a decent intro: https://jeffknupp.com/blog/2013/12/09/improve-your-python-understanding-unit-testing/ For more, see the examples in directories named tests.
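For a rough idea of what such a test could look like for the repr work, here is a small sketch using Python's standard `unittest` module. `Scaler` is a hypothetical stand-in, not an Orange class; a real test would import the actual object under test instead.

```python
import unittest


class Scaler:
    """Hypothetical stand-in for an object with a code-like __repr__."""

    def __init__(self, center=True, scale=1.0):
        self.center = center
        self.scale = scale

    def __repr__(self):
        return "Scaler(center={!r}, scale={!r})".format(self.center, self.scale)


class TestScalerRepr(unittest.TestCase):
    def test_repr_text(self):
        # The repr should spell out every constructor argument.
        self.assertEqual(repr(Scaler(center=False, scale=2.5)),
                         "Scaler(center=False, scale=2.5)")

    def test_repr_round_trip(self):
        # Evaluating the repr should rebuild an equivalent object.
        original = Scaler(center=False, scale=2.5)
        rebuilt = eval(repr(original))
        self.assertEqual(rebuilt.center, original.center)
        self.assertEqual(rebuilt.scale, original.scale)
```

Saved in a file under a tests directory, this can be run with `python -m unittest`. The round-trip test is the one that matters most for code generation, since the generated script depends on the repr being valid Python.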

Ameobea commented 8 years ago

I assume I should write tests for all the repr changes as well?

kernc commented 8 years ago

Some sensible tests in the repr PR, sure. If you make a single __repr__ function relying on the assumption presented in https://github.com/Ameobea/orange3/issues/24, there will be far fewer lines to cover/test.
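The assumption from issue 24 is not restated in this thread. Purely as an illustration, the sketch below assumes that every `__init__` parameter is stored on the instance under the same name; under that assumption a single shared `__repr__` can serve many classes. `generic_repr`, `ReprMixin`, and `Threshold` are hypothetical names, not part of Orange.

```python
import inspect


def generic_repr(obj):
    """Build a constructor-style repr.

    Assumption (hypothetical): each __init__ parameter is stored on the
    instance as an attribute with the same name.
    """
    params = inspect.signature(type(obj).__init__).parameters
    parts = []
    for name, param in params.items():
        # Skip self and *args/**kwargs, which have no single attribute.
        if name == "self" or param.kind in (param.VAR_POSITIONAL,
                                            param.VAR_KEYWORD):
            continue
        parts.append("{}={!r}".format(name, getattr(obj, name)))
    return "{}({})".format(type(obj).__name__, ", ".join(parts))


class ReprMixin:
    # One shared implementation instead of a hand-written repr per class.
    __repr__ = generic_repr


class Threshold(ReprMixin):
    def __init__(self, value=0.5, strict=False):
        self.value = value
        self.strict = strict


print(repr(Threshold(0.9)))  # Threshold(value=0.9, strict=False)
```

With one function like this, the tests only need to cover the shared implementation plus a few representative classes, rather than every hand-written `__repr__`.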