Closed simonprovost closed 1 year ago
correct? I believe I am referring to what's here (the same applies to TransformerMixin, though).
Yes, that should work. As you might have noticed, it's a bit finicky if the algorithm is both Transformer and Classifier, in which case it only gets picked up as the first one. Should be easy to adapt the parser to handle this case, but it's something that did not come up (yet).
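To illustrate the "picked up as the first one" behaviour described above, here is a minimal sketch. It does not reproduce GAMA's actual parser: the two mixin classes are stand-ins for scikit-learn's `ClassifierMixin` and `TransformerMixin`, the helper names `first_role` and `all_roles` are hypothetical, and the order in which the roles are checked is an assumption.

```python
# Stand-ins for sklearn.base.ClassifierMixin / TransformerMixin, so the
# sketch runs without scikit-learn installed.
class ClassifierMixin:
    pass


class TransformerMixin:
    pass


class PlainClassifier(ClassifierMixin):
    pass


class Hybrid(TransformerMixin, ClassifierMixin):
    """An algorithm that is both a transformer and a classifier."""


def first_role(cls):
    # First-match-wins detection: a hybrid class only ever gets one role.
    # The check order here is an assumption, not GAMA's documented order.
    if issubclass(cls, ClassifierMixin):
        return "classifier"
    if issubclass(cls, TransformerMixin):
        return "transformer"
    return "unknown"


def all_roles(cls):
    # One possible adaptation: collect every role the class qualifies for.
    roles = []
    if issubclass(cls, ClassifierMixin):
        roles.append("classifier")
    if issubclass(cls, TransformerMixin):
        roles.append("transformer")
    return roles


print(first_role(Hybrid))  # a single role, even though Hybrid has two
print(all_roles(Hybrid))   # both roles
```

Collecting all roles, rather than returning on the first match, is one way the parser could be adapted for algorithms that play both parts.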
there should be no cause for concern, correct?
I can't readily think of a reason it shouldn't work. Perhaps just give it a go :)
I believe I need only create additional functions in that script and invoke them when creating the OperatorSet().
I think that's a good starting point.
tricking the creation of the operator's set in the initialisation of this class with my new function to create an individual, would you agree?
Yes, it probably makes sense to write a subclass that overwrites only those things that are different and go from there. After it's clear what the changes are, it's easier to review if it's reasonable to refactor out inheritance and handle it by other means.
If I wanted to add the extra-needed parameter...
I think you found the relevant places considering you are not looking at exporting the models (right now).
Is there a way to disable the basic_encoding step, or is it strictly necessary to execute? If so, why?
Not right now, but there isn't a real reason it's required either (as long as the generated pipelines work on the data, everything should be OK). Refactoring it so that it is an optional step is OK with me. In fact, we started redesigning the way GAMA handles input data a while ago (https://github.com/openml-labs/gama/pull/169), but I don't have the time to finish that right now.
An article is always welcome :) but if things are missing from the documentation itself, adding it there too is also much appreciated. Good luck!
@PGijsbers Many thanks for validating this strategy! I greatly appreciate it, and everything has been noted, including the additional documentation/article (which will most likely happen during the summer). Feel free to close this issue so that you do not have too many open. If you would like me to report any problems I encounter with the approach described above in this thread, keep it open; otherwise, I will open a new issue (and try to be more succinct, now that the design's path has been validated on your end, although I still need to try it out to see what comes up).
Have a wonderful week!
I'm closing this issue, but you can open it again if you have follow up questions. Have a great week yourself :)
Dear @PGijsbers and the rest of the authors,
I have finally completed some Ph.D.-related tasks (writing a mini-thesis, building a Python library, etc.) and am now returning to more serious matters: the construction of an Auto-ML system variant based on your GAMA framework. This (question-based) issue is a result of the tremendous support I received in issue #191, so please refer to it in the event of confusion.
I have investigated all of the routes you suggested, but before getting to that, I have to say that this flexible, generic implementation of GAMA is pretty much a gold mine. Some documentation is lacking and some type annotations are missing, but the architecture and all the other processes are otherwise very elegant! This surely gives me more hope that I will be able to create my Auto-ML variant based on GAMA. Anyway, as suggested, I investigated how to configure a new search space, which makes perfect sense now. The empty list that refers to a hyperparameter shared across all algorithms using the same name is a fantastic feature, by the way! I have also looked into adding a new metric and reworking a pipeline's evaluation.
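For readers unfamiliar with the shared-hyperparameter convention mentioned above, here is a minimal sketch of the idea. The estimator classes are empty stand-ins (a real GAMA search space would use actual scikit-learn classes as keys), the values are illustrative, and the `resolve` helper is hypothetical: it only mimics how an empty list could be looked up, and GAMA's own resolution logic may differ.

```python
# Empty stand-in estimator classes, used only as dictionary keys here.
class BernoulliNB: ...
class MultinomialNB: ...


search_space = {
    # Top-level string keys declare hyperparameters shared across algorithms.
    "alpha": [1e-3, 1e-2, 1e-1, 1.0, 10.0, 100.0],
    "fit_prior": [True, False],
    # An empty list means "use the shared values declared under this name".
    BernoulliNB: {"alpha": [], "fit_prior": []},
    MultinomialNB: {"alpha": [], "fit_prior": []},
}


def resolve(space):
    """Hypothetical helper: replace each empty list with the shared values."""
    shared = {key: values for key, values in space.items() if isinstance(key, str)}
    resolved = {}
    for algorithm, params in space.items():
        if isinstance(algorithm, str):
            continue  # shared declarations are not algorithms themselves
        resolved[algorithm] = {
            name: (values if len(values) > 0 else shared[name])
            for name, values in params.items()
        }
    return resolved


resolved = resolve(search_space)
print(resolved[BernoulliNB]["alpha"])
```

Both algorithms end up pointing at the same list of `alpha` values, which is what makes the shared declaration useful when the same hyperparameter should be searched consistently across algorithms.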
Following this, I spent a full day researching this topic, and while digging deeper, I also learned what an Individual is. I was initially very confused because it is located in the part of GAMA devoted to "genetic programming"; as a result, I assumed it was solely related to the evolutionary algorithm. That was a mistake on my part: it is a highly adaptable, generic way of representing individuals, applicable to any situation in which individuals are required, and it relies on genetic programming operators/primitives, which is also very elegant. I also learned about the OperatorSet and a number of other components. After that, I looked at how the Automated Imbalance Machine Learning (paper) authors implemented their system and what modifications they made to GAMA to make it work as they intended. FYI, I expect to make a comparable degree of change and therefore to follow a comparable path. Nonetheless, a couple of concerns have come up, and I will highlight them below:
In conclusion, I'd like to ask if there's anything else I should be aware of, or any other area you'd suggest I explore?
Please excuse the length of this issue. I felt it essential to provide a wealth of detail so that your response can be as informed and directed as possible, thereby avoiding back-and-forth questioning, if that makes sense. With a Ph.D. deadline approaching, it is also of paramount importance that I secure a viable proof of concept for my system as early as possible, so I asked as much as I could in order to have all the tools in hand to get it done.
Lastly, I would like to mention that your support will not only help me develop a variant, plus paper(s) citing GAMA; I also intend to write a brief article (probably on Medium) to guide newcomers in using GAMA to develop their own GAMA-based Auto-ML system. I believe the advanced section of the documentation is excellent and sufficient for some use cases, but for the implementation of Imbalanced Auto-ML, or for my current use case, things are a bit more complex, and I believe an article would be helpful (unless you object). During the summer, if everything goes as smoothly as I anticipate, I will start working on that, and potentially on some PRs for improvements I notice along the way.
Wishing you a lovely weekend in the Netherlands! Feel free to ask me any questions, and I will respond as promptly as possible! Best wishes,