avocado-framework / aexpect

Python library used to control interactive programs
GNU General Public License v2.0
8 stars 32 forks source link

RFC: Split optional modules #114

Closed ldoktor closed 1 year ago

ldoktor commented 1 year ago

the aexpect package is becoming bigger, it might be useful to split it across multiple optional modules. Let's start with "remote" and "ops" modules. The difference is only in way we import those modules, instead of:

from aexpect import remote_door

we should do:

from aexpect_remote import remote_door

although, for compatibility reasons, some of the modules are still available directly in aexpect, when installed on the system.

Signed-off-by: Lukáš Doktor ldoktor@redhat.com

ldoktor commented 1 year ago

Hello @pevogam, @lmr, @clebergnu, as this project grows, I think it'd make sense to split the very core, that can be useful for anything from self automation to driving complex logic and the extra features that doesn't need to be installed all the time. The functionality should be very similar and the release process should not be that more complicated.

Note this PR is just a discussion starter, not intended for merging until refined and polished.

pevogam commented 1 year ago

Hi @ldoktor I am not sure I am really a big fan of this idea. Actually I thought we had already discussed this before and decided that separation and/or migration of modules will be in order if we extend much further than this or eventually get avocado utilities library in the future. So I am not sure how despite our conclusions and thinking this is settled for now we end up with a PR splitting this fairly small project and already moving around the few modules we have added in the years (3 modules a few years ago and 2 modules now).

I am also not sure I understand your lightness argument as these five python modules barely increase the data size of the project and without them the project is more or less a single major python module. So I am not sure on your insistence to keep the project down to this single python module as module boundaries are already good separation that does not necessitate the need for project boundaries (separate projects of atomic modules). All in all, I don't really agree with all added deployment complexity here (separate installs, configuration, version sync and compatibility, etc.) and understand what it gives us back for the maintenance costs. It could be simply that we have different understanding of scope and scale, not really sure.

ldoktor commented 1 year ago

Hello @pevogam, I created the PR to test and see how much complexity will be needed to do so. We might end-up deciding it's not worth it, but without the experiment, we don't know. As for me I'd welcome this (or something similar, even if just two pkgs aexpect and aexpect-extras) because I do like aexpect and I am promoting it to developers for home automation (eg. let's wait until a physical machine reboots and modify some bios value over serial console). I know those extra modules don't add MB of data, but it adds a lot of functionalities where ugly code can hide. Mistakes happen so I'd like to have something really minimalist that can suite many purposes and then have a second (or multiple) sub-projects that will add the remaining features.

The separation should also simplify definition of what is well established code and what is alpha or beta version. Similarly the pylint setting can be made globally per project so we clearly set the boundaries. For example the main aexpect project's classifier says it's mature, while recently there are new APIs that are in alpha stage, still in development.

As for the deployment issues, we can simply define them in the respected setup.py files. In case a new aexpect_remote requires certain aexpect version, let's set it there and pip will take care of the rest. So the complexity is really minimal and everything can live within a repo and can be packaged with a single command.

pevogam commented 1 year ago

Hello @pevogam, I created the PR to test and see how much complexity will be needed to do so. We might end-up deciding it's not worth it, but without the experiment, we don't know. As for me I'd welcome this (or something similar, even if just two pkgs aexpect and aexpect-extras) because I do like aexpect and I am promoting it to developers for home automation (eg. let's wait until a physical machine reboots and modify some bios value over serial console).

I don't see how the extra modules harm the objective of home automation. For what it seems, they enhance it since I bet a lot of home automation would make use of test, file, directory, and path checks, grep, and other standard GNU utilities, all of which is now provided with batteries included. It seems to me from the use case example you give that you rather imagine using aexpect as a single near-boot-level (minimal boot targets and environments) script which seems much less common and more niche use case to me with a generic use case actually making good use of file systems and GNU utilities and thus benefiting from the aexpect provided API.

I know those extra modules don't add MB of data, but it adds a lot of functionalities where ugly code can hide. Mistakes happen so I'd like to have something really minimalist that can suite many purposes and then have a second (or multiple) sub-projects that will add the remaining features.

The "many purposes" is better met with the modules included than without (I consider not having to manually implement means to achieve a purpose as one way of meeting that purpose), the "something really minimalist" part is what I have harder time to figure out. With the new information above I guess you mean "not hide ugly code" and thus induce errors on the side of the user but it would be great if you could elaborate on a sitaution like that so that I can try to understand you intent better. IMO this would rather require improvement of the API itself (extra validation, stricter form) and dropping the API altogether for this group of users is not the best solution here.

The separation should also simplify definition of what is well established code and what is alpha or beta version. Similarly the pylint setting can be made globally per project so we clearly set the boundaries. For example the main aexpect project's classifier says it's mature, while recently there are new APIs that are in alpha stage, still in development.

Every project out there at some point has to or has already introduced new features on top of a stable mainline and at least I haven't seen each such project being split into multiple subprojects or products so that the alpha/beta versions of the new features are properly handled. As all software develops one could always expect parts of it are new and parts of it are old, interfaces or implementations alike. Linting standards should not be factured either instead of simply setting on a unified set of choices for the entire project for all incoming and old code.

As for the deployment issues, we can simply define them in the respected setup.py files. In case a new aexpect_remote requires certain aexpect version, let's set it there and pip will take care of the rest. So the complexity is really minimal and everything can live within a repo and can be packaged with a single command.

Even simply maintaining separate setup.py files (not to mention installing some or all of them and confusing users as to what they need and what they don't via multiple provisions and installation options) seems like too much cost for me and I am not even going into the possibilities of these separate modules actually having different dependencies and versions thereof. Doing all of this for a couple of modules seems like a complete overkill for me at best and like simply wanting to get rid of contributed modules and code at worst. But I might be wrong here, it is simply my impression of what it signals to potential contributors down the line from me (I took care of migrating their code) that just managed to get their pull requests merged days ago only to see them separated right away as if they were forcibly accepted into the mainline.

ldoktor commented 1 year ago

Oups, I accidentally merged this instead of the 113. I forced-pushed the change back as I noticed this the minute I pushed it, anyway let me re-open this so we can discuss it further.

lmr commented 1 year ago

So, my opinions on the subject:

  1. Yes, the library has increased in size, but it is not so large that it would be necessary to split
  2. Since the new functionality introduced by @pevogam has been merged, it makes sense to release it as is
  3. Looking at the current code, I see it could benefit from using namespaces a bit more - ops and remote should become separate directories under aexpect, same as we have for utils. I believe this would make it easier to create separate RPM packages out of the source code.
  4. The problem with 3. is that this could introduce some API incompatible changes to the code, which would force us to handle updates for all the aexpect dependencies. I would like to gauge people's interest on doing that.

So I would say, release what is there per 2., and let's start to consider the restructuring proposed in 3.

lmr commented 1 year ago

We have a difference of opinion on the merits of having all code shipped together or separating it inside the aexpect code.

I believe I have a proposal that could make up for a good compromise between the 2 views:

Create a new library with the higher level constructs that depends on aexpect and host it under the avocado-framework area. @pevogam would be the maintainer, as he contributed most of the code available in it and we can keep aexpect small and self contained. How about that?

pevogam commented 1 year ago

Hi @lmr I agree with your and @ldoktor's point of view but rather at a stage where this code is significant enough to become its own standalone project. Right now I simply don't think that we are ready for this with just a few functions that cover some very rudimentary use cases of remote platforms. At present the project only contains a few modules that are already functionally well separated and adding further namespace separation will result in a single namespace being assigned for each 1-2 modules which is an overprovisioning for the size of the current codebase as it merges the notion of module and namespace to the extend where the second is no longer useful and only complicates usage further. Not that I am referring to @lmr's point (3) here and I am not even going into the complexity added by separate deployment.

Regarding @lmr's point (2), I could argue that this is additional reason for releasing and not separating right now. As this is all brand new functionality and most of aexpect's users are not aware of it yet, keeping the modules together will make everyone aware of their existence via their default installs and deployment pipelines and thus increase the visibility of these modules. If they are immediately removed in separate packages, most people will not even become aware of them when installing aexpect and we are essentially not even giving them a shot at wider adoption.

Also from what I see above, I did not receive any comments or counterarguments to any of my arguments listed in https://github.com/avocado-framework/aexpect/pull/114#issuecomment-1398319873. Some further explanations, details, or pointing to the weaknesses in my own statements there would be highly appreciated! The argument about visibility above an be added to the same list now. Thank you both for your tolerance and continued support regarding these matters!

ldoktor commented 1 year ago

Hello @pevogam, I created the PR to test and see how much complexity will be needed to do so. We might end-up deciding it's not worth it, but without the experiment, we don't know. As for me I'd welcome this (or something similar, even if just two pkgs aexpect and aexpect-extras) because I do like aexpect and I am promoting it to developers for home automation (eg. let's wait until a physical machine reboots and modify some bios value over serial console).

I don't see how the extra modules harm the objective of home automation. For what it seems, they enhance it since I bet a lot of home automation would make use of test, file, directory, and path checks, grep, and other standard GNU utilities, all of which is now provided with batteries included. It seems to me from the use case example you give that you rather imagine using aexpect as a single near-boot-level (minimal boot targets and environments) script which seems much less common and more niche use case to me with a generic use case actually making good use of file systems and GNU utilities and thus benefiting from the aexpect provided API.

The ops module does not have any additional libraries so it's not that critical, but still it's more code that needs to be trusted not to do anything nasty on the target systems. The automation people use it here is rather minimal, usually just rebooting a machine, sometimes setting bios configuration. These are usually one-time scripts that don't really benefit from upper level API.

I know those extra modules don't add MB of data, but it adds a lot of functionalities where ugly code can hide. Mistakes happen so I'd like to have something really minimalist that can suite many purposes and then have a second (or multiple) sub-projects that will add the remaining features.

The "many purposes" is better met with the modules included than without (I consider not having to manually implement means to achieve a purpose as one way of meeting that purpose), the "something really minimalist" part is what I have harder time to figure out. With the new information above I guess you mean "not hide ugly code" and thus induce errors on the side of the user but it would be great if you could elaborate on a sitaution like that so that I can try to understand you intent better. IMO this would rather require improvement of the API itself (extra validation, stricter form) and dropping the API altogether for this group of users is not the best solution here.

The thing is people need to install aexpect on the machines, which means they have to review and trust the library as well as dependencies. The ops module does not have additional dependencies, but the remote one has multiple, which expands the scope. Even without the extra deps one have to overlook the changes and trust us to not to let potential sneaky code to slip in, which is easier done with minimalistic approach.

In terms of features, if we keep things under the same code-base and promote the modules in our documentation (even in install guide) it should be visible enough and those who decide to utilize the features should have the opportunity. It'd be probably nice to promote the tool on some conference, I thought about that but currently have other talks considered. Maybe next year, or you can consider suggesting some... (I'd be interested in the ways you use remote door as well as other features, my usage is solely on ShellSession)

The separation should also simplify definition of what is well established code and what is alpha or beta version. Similarly the pylint setting can be made globally per project so we clearly set the boundaries. For example the main aexpect project's classifier says it's mature, while recently there are new APIs that are in alpha stage, still in development.

Every project out there at some point has to or has already introduced new features on top of a stable mainline and at least I haven't seen each such project being split into multiple subprojects or products so that the alpha/beta versions of the new features are properly handled. As all software develops one could always expect parts of it are new and parts of it are old, interfaces or implementations alike. Linting standards should not be factured either instead of simply setting on a unified set of choices for the entire project for all incoming and old code.

Sure and we had staging for that in autotest. Worst parts of that was always the compatibility layer as people started depending on something that was not really ready. That is why I was pushing a bit more before the release to avoid getting support for ops only to remove it in the next release, if we decided to move it to the other sub-package.

As for the deployment issues, we can simply define them in the respected setup.py files. In case a new aexpect_remote requires certain aexpect version, let's set it there and pip will take care of the rest. So the complexity is really minimal and everything can live within a repo and can be packaged with a single command.

Even simply maintaining separate setup.py files (not to mention installing some or all of them and confusing users as to what they need and what they don't via multiple provisions and installation options) seems like too much cost for me and I am not even going into the possibilities of these separate modules actually having different dependencies and versions thereof. Doing all of this for a couple of modules seems like a complete overkill for me at best and like simply wanting to get rid of contributed modules and code at worst. But I might be wrong here, it is simply my impression of what it signals to potential contributors down the line from me (I took care of migrating their code) that just managed to get their pull requests merged days ago only to see them separated right away as if they were forcibly accepted into the mainline.

I definitely don't want to push you away, most recent changes comes from you and I am glad for that (even though I don't usually utilize the new features I'm glad the tool is alive and somehow active). When I look at this PR I don't see too much work that would be needed to separate those project and as for the release process, you might actually benefit. We'd have stable core that would be released only with bug fixes or core feature changes and you would gain the possibility to release more often new features. The boilerplate code should not be that extensive and changes are usually near-to-none (see git log of the current aexpect).

As for the deployment, we should improve our README, add install steps there and promote the additional modules there. That way it'd be also visible on PYPI. Plus the conf talk might attract some users.

pevogam commented 1 year ago

I don't see how the extra modules harm the objective of home automation. For what it seems, they enhance it since I bet a lot of home automation would make use of test, file, directory, and path checks, grep, and other standard GNU utilities, all of which is now provided with batteries included. It seems to me from the use case example you give that you rather imagine using aexpect as a single near-boot-level (minimal boot targets and environments) script which seems much less common and more niche use case to me with a generic use case actually making good use of file systems and GNU utilities and thus benefiting from the aexpect provided API.

The ops module does not have any additional libraries so it's not that critical, but still it's more code that needs to be trusted not to do anything nasty on the target systems. The automation people use it here is rather minimal, usually just rebooting a machine, sometimes setting bios configuration. These are usually one-time scripts that don't really benefit from upper level API.

I see, I just wonder if this is the main target audience we are after or the users that want to automate stuff through sessions on more or less regular platforms (thus needing some file manipulation and the like). Could you provide an example where a user could do something really nasty simply by using a command like ops.ls or some of the other functions we provide there?

The "many purposes" is better met with the modules included than without (I consider not having to manually implement means to achieve a purpose as one way of meeting that purpose), the "something really minimalist" part is what I have harder time to figure out. With the new information above I guess you mean "not hide ugly code" and thus induce errors on the side of the user but it would be great if you could elaborate on a sitaution like that so that I can try to understand you intent better. IMO this would rather require improvement of the API itself (extra validation, stricter form) and dropping the API altogether for this group of users is not the best solution here.

The thing is people need to install aexpect on the machines, which means they have to review and trust the library as well as dependencies. The ops module does not have additional dependencies, but the remote one has multiple, which expands the scope. Even without the extra deps one have to overlook the changes and trust us to not to let potential sneaky code to slip in, which is easier done with minimalistic approach.

Could you point me to specific dangerous dependencies that remote introduces? Perhaps there is a way we can deal with this better?

It'd be probably nice to promote the tool on some conference, I thought about that but currently have other talks considered. Maybe next year, or you can consider suggesting some... (I'd be interested in the ways you use remote door as well as other features, my usage is solely on ShellSession)

I would definitely like to demonstrate the uses of the remote door of which we now have lots of production code with extensive benefiting and I am sure it would help a lot understanding the power and flexibility all the approaches there provide. But regarding the way to advertise the new methods right now, we could fall back to the same way we did it with remote and the remote_door - by including the code and making it available and seeing if people would pick up on some use. I still find the extensions added by ops way too rudimentary in order to consider it big feature that needs its own documentation and presentations though. I would agree more about the remote_door and possibly even remote but these are already part of aexpect as-is and at present we are blocking the release with ops and not with the already integrated two modules that are more worth this consideration.

Every project out there at some point has to or has already introduced new features on top of a stable mainline and at least I haven't seen each such project being split into multiple subprojects or products so that the alpha/beta versions of the new features are properly handled. As all software develops one could always expect parts of it are new and parts of it are old, interfaces or implementations alike. Linting standards should not be factured either instead of simply setting on a unified set of choices for the entire project for all incoming and old code.

Sure and we had staging for that in autotest. Worst parts of that was always the compatibility layer as people started depending on something that was not really ready. That is why I was pushing a bit more before the release to avoid getting support for ops only to remove it in the next release, if we decided to move it to the other sub-package.

I see, this still doesn't sound like the way major software projects would do it though and it definitely seems like everyone with a viable software product should have the same problem. So I wonder what would you think would be the major solution here since simply pushing harder sounds more like duck-taping a bigger actual problem. I could see for instance the same thing happening recursively - we move the ops and few other modules to their own package, it is marked as "alpha", then eventually it gets its own new experimental features, and using the same argument we will have to move these in yet another subproject of a subproject in order to achieve the same effect there. Or do you suggest something else here?

Even simply maintaining separate setup.py files (not to mention installing some or all of them and confusing users as to what they need and what they don't via multiple provisions and installation options) seems like too much cost for me and I am not even going into the possibilities of these separate modules actually having different dependencies and versions thereof. Doing all of this for a couple of modules seems like a complete overkill for me at best and like simply wanting to get rid of contributed modules and code at worst. But I might be wrong here, it is simply my impression of what it signals to potential contributors down the line from me (I took care of migrating their code) that just managed to get their pull requests merged days ago only to see them separated right away as if they were forcibly accepted into the mainline.

I definitely don't want to push you away, most recent changes comes from you and I am glad for that (even though I don't usually utilize the new features I'm glad the tool is alive and somehow active). When I look at this PR I don't see too much work that would be needed to separate those project and as for the release process, you might actually benefit. We'd have stable core that would be released only with bug fixes or core feature changes and you would gain the possibility to release more often new features. The boilerplate code should not be that extensive and changes are usually near-to-none (see git log of the current aexpect).

As for the deployment, we should improve our README, add install steps there and promote the additional modules there. That way it'd be also visible on PYPI. Plus the conf talk might attract some users.

I agree that ultimately the freedom of releasing and independently developing, merging, and experimenting are all benefits and I would definitely go that way if I am actually sure that we will do a lot of additional development. No further changes were needed on the remote module and remote door on our side for years now (what they provide is all pretty stable and self-sufficient at this point) and while the ops might indeed look like something that could see a lot of future development, my original suggestion with it is that we want to see if people would really like to pick up on it. Since we are also considering migrating the ops modules to some future "avocado utils" project, I am not sure if it is worth it to subproject this right now with all the documentation, README, workflows, and the like if it ends up migrating there in the long term (I am sourcing this from our original discussion in previous pull requests).

So all in all, it all seems way to conditional for me right now in order to think we should invest all this effort and I would only do this if the modules see future development which is more or less coincident with them growing in size and code base (my original "if" wording from before). But let me see what your thoughts are, thank you for providing proper counter-arguments from my previous arguments so that we can both align our conclusions better.

ldoktor commented 1 year ago

I don't see how the extra modules harm the objective of home automation. For what it seems, they enhance it since I bet a lot of home automation would make use of test, file, directory, and path checks, grep, and other standard GNU utilities, all of which is now provided with batteries included. It seems to me from the use case example you give that you rather imagine using aexpect as a single near-boot-level (minimal boot targets and environments) script which seems much less common and more niche use case to me with a generic use case actually making good use of file systems and GNU utilities and thus benefiting from the aexpect provided API.

The ops module does not have any additional libraries so it's not that critical, but still it's more code that needs to be trusted not to do anything nasty on the target systems. The automation people use it here is rather minimal, usually just rebooting a machine, sometimes setting bios configuration. These are usually one-time scripts that don't really benefit from upper level API.

I see, I just wonder if this is the main target audience we are after or the users that want to automate stuff through sessions on more or less regular platforms (thus needing some file manipulation and the like). Could you provide an example where a user could do something really nasty simply by using a command like ops.ls or some of the other functions we provide there?

Perhaps I see the confusion, I wasn't talking about usage issues, but about potential threats. Be it nasty dependency or just code that might slip in due to lack of reviews. Doing audit of the core aexpect is fairly simple, but especially the remote extensions are quite complex and add dependencies. Having minimalistic stable core that perform tasks that were sufficient for autotest and avocado-vt for decades and is well reviewed and established, while having plugins or additional optional libraries for those who are interested in the extensions would be very nice and in this PR I wanted to experiment with how much work would that be. I think it would be actually quite simple and it would even help with delegating the maintainship to those, who are responsible for most of the new code. And provided we link and promote the setup in the correct places, people should simply install the aexpect_modules (if we merged remote and ops into a single pkg) or aexpect_remote aexpect_ops to get all the features, or just aexpect to get the well reviewed, minimalistic core library for basic automation.

The "many purposes" is better met with the modules included than without (I consider not having to manually implement means to achieve a purpose as one way of meeting that purpose), the "something really minimalist" part is what I have harder time to figure out. With the new information above I guess you mean "not hide ugly code" and thus induce errors on the side of the user but it would be great if you could elaborate on a sitaution like that so that I can try to understand you intent better. IMO this would rather require improvement of the API itself (extra validation, stricter form) and dropping the API altogether for this group of users is not the best solution here.

The thing is people need to install aexpect on the machines, which means they have to review and trust the library as well as dependencies. The ops module does not have additional dependencies, but the remote one has multiple, which expands the scope. Even without the extra deps one have to overlook the changes and trust us to not to let potential sneaky code to slip in, which is easier done with minimalistic approach.

Could you point me to specific dangerous dependencies that remote introduces? Perhaps there is a way we can deal with this better?

Again, the module seems fine by me (not a security auditor), but it's more code to be reviewed and additional dependencies that need to be installed, trusted and audited with each release.

It'd be probably nice to promote the tool on some conference, I thought about that but currently have other talks considered. Maybe next year, or you can consider suggesting some... (I'd be interested in the ways you use remote door as well as other features, my usage is solely on ShellSession)

I would definitely like to demonstrate the uses of the remote door of which we now have lots of production code with extensive benefiting and I am sure it would help a lot understanding the power and flexibility all the approaches there provide. But regarding the way to advertise the new methods right now, we could fall back to the same way we did it with remote and the remote_door - by including the code and making it available and seeing if people would pick up on some use. I still find the extensions added by ops way too rudimentary in order to consider it big feature that needs its own documentation and presentations though. I would agree more about the remote_door and possibly even remote but these are already part of aexpect as-is and at present we are blocking the release with ops and not with the already integrated two modules that are more worth this consideration.

https://cfp.devconf.info/

Every project out there at some point has to or has already introduced new features on top of a stable mainline and at least I haven't seen each such project being split into multiple subprojects or products so that the alpha/beta versions of the new features are properly handled. As all software develops one could always expect parts of it are new and parts of it are old, interfaces or implementations alike. Linting standards should not be factured either instead of simply setting on a unified set of choices for the entire project for all incoming and old code.

Sure and we had staging for that in autotest. Worst parts of that was always the compatibility layer as people started depending on something that was not really ready. That is why I was pushing a bit more before the release to avoid getting support for ops only to remove it in the next release, if we decided to move it to the other sub-package.

I see, this still doesn't sound like the way major software projects would do it though and it definitely seems like everyone with a viable software product should have the same problem. So I wonder what would you think would be the major solution here since simply pushing harder sounds more like duck-taping a bigger actual problem. I could see for instance the same thing happening recursively - we move the ops and few other modules to their own package, it is marked as "alpha", then eventually it gets its own new experimental features, and using the same argument we will have to move these in yet another subproject of a subproject in order to achieve the same effect there. Or do you suggest something else here?

Yes, at the same time some projects grow bigger and certain parts become useful on their own and can live independent life (like GTK).

Even simply maintaining separate setup.py files (not to mention installing some or all of them and confusing users as to what they need and what they don't via multiple provisions and installation options) seems like too much cost for me and I am not even going into the possibilities of these separate modules actually having different dependencies and versions thereof. Doing all of this for a couple of modules seems like a complete overkill for me at best and like simply wanting to get rid of contributed modules and code at worst. But I might be wrong here, it is simply my impression of what it signals to potential contributors down the line from me (I took care of migrating their code) that just managed to get their pull requests merged days ago only to see them separated right away as if they were forcibly accepted into the mainline.

I definitely don't want to push you away, most recent changes comes from you and I am glad for that (even though I don't usually utilize the new features I'm glad the tool is alive and somehow active). When I look at this PR I don't see too much work that would be needed to separate those project and as for the release process, you might actually benefit. We'd have stable core that would be released only with bug fixes or core feature changes and you would gain the possibility to release more often new features. The boilerplate code should not be that extensive and changes are usually near-to-none (see git log of the current aexpect). As for the deployment, we should improve our README, add install steps there and promote the additional modules there. That way it'd be also visible on PYPI. Plus the conf talk might attract some users.

I agree that ultimately the freedom of releasing and independently developing, merging, and experimenting are all benefits and I would definitely go that way if I am actually sure that we will do a lot of additional development. No further changes were needed on the remote module and remote door on our side for years now (what they provide is all pretty stable and self-sufficient at this point) and while the ops might indeed look like something that could see a lot of future development, my original suggestion with it is that we want to see if people would really like to pick up on it. Since we are also considering migrating the ops modules to some future "avocado utils" project, I am not sure if it is worth it to subproject this right now with all the documentation, README, workflows, and the like if it ends up migrating there in the long term (I am sourcing this from our original discussion in previous pull requests).

So all in all, it all seems way to conditional for me right now in order to think we should invest all this effort and I would only do this if the modules see future development which is more or less coincident with them growing in size and code base (my original "if" wording from before). But let me see what your thoughts are, thank you for providing proper counter-arguments from my previous arguments so that we can both align our conclusions better.

Long time ago I talked to python developers on FUDconf, showed them avocado utils and discussed the future of it. They loved them and mentioned we should ideally separate them into individual packages and release them individually. At that time that sounded crazy to me, so much boilerplate code, so much duplicity and work to maintain individual pieces. But the more audits we get here and the more I see how rust crates works I'm becoming fond of individual separated pieces that people can reuse in their projects. Aexpect-core is such a piece. Ops and remote door looks like good candidates as well, although they are quite bound to aexpect.

All in all, to me it seems worth the effort, I'd welcome slim unix-like tool that does one job and does it well, while having the option to install additional module(s) to gain more features. Ideally I'd split all of them to individual, unrelated pieces, but having core and modules that would contain the rest would be also welcome by me. As for the current situation, I can live with it, but to me the benefits are way bigger than the effort that would this split require.

pevogam commented 1 year ago

I see, I just wonder if this is the main target audience we are after or the users that want to automate stuff through sessions on more or less regular platforms (thus needing some file manipulation and the like). Could you provide an example where a user could do something really nasty simply by using a command like ops.ls or some of the other functions we provide there?

Perhaps I see the confusion, I wasn't talking about usage issues, but about potential threats. Be it nasty dependency or just code that might slip in due to lack of reviews. Doing audit of the core aexpect is fairly simple, but especially the remote extensions are quite complex and add dependencies.

Why do you suggest lack of reviews in particular? It think this simplicity might be biased towards your specific point of view. Code and the way the initial utilities function seem easier to you as you were one of the main developers to write them and probably know them pretty well but in the same way they don't seem as straightforward to me (I have never used anything beyond the final session classes in the inheritrance hierarchy) and I could argue the same thing about the remote modules and ops the same way you do. So again, could you provide a specific example of a way in which the ops package increases the vulnerability of the project for this target group of users (as it seems we also see different potential target audiences here)? Otherwise I would prefer not to use terms like "simple/complex" unless it could be quantified in some clear way and we have both already agreed that the size of the code base has not increased enough to be a criterion here.

When it comes to dependencies (something that could be more absolute and easier to align from multiple points of view/history/development) the fact that you don't point me to specific cases keeps the ball in this same vague playground so let's see if I can come up with answers regarding this: the only major dependency I can think about that might seem like a big and non-standard add-on is probably Pyro4. It is not really possible to provide any remote python objects without it and I would gladly drop it if there was one. I am generally also in favor of dropping dependencies as much as possible to avoid deployment attacks and other sources of stability/security vulnerabilities but I prefer the moderate approach here where we drop dependencies for small functionality we can implement ourselves and leave only major dependencies we cannot go without. So I would not use log4j for text color for instance but it is important not to go to the opposite side of the spectrum either. Then to add a second point to this which is slightly reiterated (you didn't seem to comment on it), this concerns the remote utilities and the current release is blocked because of the recently added ops package. I would prefer to either focus the conversation on the ops package itself (in which case I am still not aware what difficult dependencies it adds so far) or have us discuss the bigger picture over the longer term but not relate this pull request to the ops package and its release specifically.

Having minimalistic stable core that perform tasks that were sufficient for autotest and avocado-vt for decades and is well reviewed and established,

I can't say I agree they were sufficient - don't forget that a lot of the code you suggest we move to a separate package has been migrated directly from avocado-vt because people there needed it for decades and simplify developed it within their own project scope. The same can be said for the numerous cases where I have seen people redevelop functions for basic aexpect ops again and again in various pull requests in those projects.

while having plugins or additional optional libraries for those who are interested in the extensions would be very nice and in this PR I wanted to experiment with how much work would that be. I think it would be actually quite simple and it would even help with delegating the maintainship to those, who are responsible for most of the new code. And provided we link and promote the setup in the correct places, people should simply install the aexpect_modules (if we merged remote and ops into a single pkg) or aexpect_remote aexpect_ops to get all the features, or just aexpect to get the well reviewed, minimalistic core library for basic automation.

So again "actually quite simple" is really relative and entangled with points of view with the only more precise argument being "delegating the maintainership to those responsible" which can be achieved with me helping out with maintenance and code review of the relevant code, something that I don't mind doing. If this is your concern than we have simpler ways of doing this by assigning me on the relevant parts of the code and not going all the way to separate promotions and deployments. Let me clarify my use fo "simple" here: it implies less action and organizaional effort has to be taken to assign me the relevant areas of code (A) versus assigne me the relevant areas of code and create a new project (A+B).

I would definitely like to demonstrate the uses of the remote door of which we now have lots of production code with extensive benefiting and I am sure it would help a lot understanding the power and flexibility all the approaches there provide. But regarding the way to advertise the new methods right now, we could fall back to the same way we did it with remote and the remote_door - by including the code and making it available and seeing if people would pick up on some use. I still find the extensions added by ops way too rudimentary in order to consider it big feature that needs its own documentation and presentations though. I would agree more about the remote_door and possibly even remote but these are already part of aexpect as-is and at present we are blocking the release with ops and not with the already integrated two modules that are more worth this consideration.

https://cfp.devconf.info/

Thanks! However I lack counterarguments to the arguments laid down here.

I see, this still doesn't sound like the way major software projects would do it though and it definitely seems like everyone with a viable software product should have the same problem. So I wonder what would you think would be the major solution here since simply pushing harder sounds more like duck-taping a bigger actual problem. I could see for instance the same thing happening recursively - we move the ops and few other modules to their own package, it is marked as "alpha", then eventually it gets its own new experimental features, and using the same argument we will have to move these in yet another subproject of a subproject in order to achieve the same effect there. Or do you suggest something else here?

Yes, at the same time some projects grow bigger and certain parts become useful on their own and can live independent life (like GTK).

I am also seem to be lacking your counterarguments here. Parts of software projects growing bigger and becoming more useful on their own does not explain having to cut them out while they are still small or what the solution above would be. Is cutting those parts of software out in early experimental stages the solution you propose to this common problem of software maturation? How about any comment/counterargument to the recursive problem I pointed out above? Right now I don't find your argumentation exhaustive enough here to see the flaws in the one I provided.

I agree that ultimately the freedom of releasing and independently developing, merging, and experimenting are all benefits and I would definitely go that way if I am actually sure that we will do a lot of additional development. No further changes were needed on the remote module and remote door on our side for years now (what they provide is all pretty stable and self-sufficient at this point) and while the ops might indeed look like something that could see a lot of future development, my original suggestion with it is that we want to see if people would really like to pick up on it. Since we are also considering migrating the ops modules to some future "avocado utils" project, I am not sure if it is worth it to subproject this right now with all the documentation, README, workflows, and the like if it ends up migrating there in the long term (I am sourcing this from our original discussion in previous pull requests).

Would you mind answering my concern about the ops package migrating to a future avocado utils project instead?

So all in all, it all seems way to conditional for me right now in order to think we should invest all this effort and I would only do this if the modules see future development which is more or less coincident with them growing in size and code base (my original "if" wording from before). But let me see what your thoughts are, thank you for providing proper counter-arguments from my previous arguments so that we can both align our conclusions better.

Long time ago I talked to python developers on FUDconf, showed them avocado utils and discussed the future of it. They loved them and mentioned we should ideally separate them into individual packages and release them individually. At that time that sounded crazy to me, so much boilerplate code, so much duplicity and work to maintain individual pieces. But the more audits we get here and the more I see how rust crates works I'm becoming fond of individual separated pieces that people can reuse in their projects. Aexpect-core is such a piece. Ops and remote door looks like good candidates as well, although they are quite bound to aexpect.

I don't consider my understanding of this immature in any way and as something that will eventually evolve into appreciating Rust crates and modular deployment. In my opinion, the current project is already small and separated from lots of other potentially too coupled projects and it is thus modular enough. My reasoning is rather somewhat mathematical rather than rooted in the specific code - if we were to separate python modules into separately installable python packages then historically we would have never ended up having either the concept of "module" or the concept of "package" since the original ones would coincide. I can use the same argument to suggest we split into aexpect-code, aexpect-door, aexpect-remote, and aexpect-ops with each package having separate maintainer, documentation, and packaging, but while this might seem fine-grained down to functionality it is also (not perfectly but yet to a pretty good approximation) fine-grained down to the module level.

@ldoktor In the end you are the boss here and can decide the future of the project so we will all go ahead with your vision about it. I just want to underline that I remain unconvinced as most of my arguments have not been countered in any way and I don't see a lot of new (not overlapping with what was already stated) arguments on your side.

@lmr Do you have any final opinion here?

As our deployment of all this merged functionality really hangs for weeks now hoping to get a release I would be glad if we either decide one way or another or alternatively don't block a lot of new functionality from a release if we want to keep this discussion ongoing.

pevogam commented 1 year ago

@lmr Would you mind giving us your final call on this by the end of the week?

I think I made a mistake by posting here since the pull request was merged and then a new one was opened in its place where you last posted, I glossed over that and kept posting in the original pull request (here).

lmr commented 1 year ago

Ok, that was a lot of discussion to catch up to.

I wanted to find something that was a reasonable compromise between @ldoktor and @pevogam's perspectives, that is why I attempted to go with the module separation proposal.

I also did not want to drive away @pevogam, as he has been a reliable contributor, and that is why I felt like it was more than deserved that you get a place as a maintainer so that you would get to drive the new module under the avocado umbrella.

But it seems like my proposal once again does not make either completely happy, I feel like I have to flip a coin and make a call. I see pros and cons of both approaches (split or ship it as is) and do not have strong feelings towards any of the outcomes, TBH.

Since I will have to make a call, that call is to merge the new modules and release them as is, as this respects the time @pevogam has put into the module and his help over the years, and the people using aexpect can live with the extra code. Since @ldoktor has been suggesting colleagues to use aexpect to use only its core functionality, they would stay using the functionality and then the new APIs wouldn't matter.

My advice to @pevogam would be to structure the import calls on his projects to transition to a possible future where we would like to split the modules, but with no immediate commitment to pursue such a future.

Thanks for the opinions and productive discussion!

ldoktor commented 1 year ago

OK, thanks Lucas, it's always a tough call. As for the future plans, @pevogam I promise to keep acking and merging stuff that uses standard imports, but once the extra requirements (usual and well covered are fine by me, just the esoteric or less covered ones) will be required, I'll likely reopen this question.

pevogam commented 1 year ago

Hi both, I would gladly take up more responsibilities, just in the right way and on code base that is large enough to require a separate project. I am already taking more responsibilities here without the privileges that come with these which I am sure you will believe me is actually harder. So right now I fully commit to helping with maintenance of all contributed code and if you have any further suggestions on how we could facilitate this without splitting just 1-2 contributed modules feel free to suggest and even if takes more compromises on my side I am willing to make them.

OK, thanks Lucas, it's always a tough call. As for the future plans, @pevogam I promise to keep acking and merging stuff that uses standard imports, but once the extra requirements (usual and well covered are fine by me, just the esoteric or less covered ones) will be required, I'll likely reopen this question.

I don't think there is any problem with this, my argument was about 1-2 modules and not about an entire library of such that all offer their own extensions to begin with. Note I don't consider adding windows ops in addition to linux ops as something that screams about being a separate project, rather a functionality which takes the current use too far. But even with the current functionality we already plan to migrate to avocado utils repository or aexpect's own repository should the code base really increase and by that I mean become less "skinny" than what the current ops provide (something that compared to the scale of entire operating systems is definitely minimal at its current state IMO).

pevogam commented 1 year ago

@ldoktor I assume you are not performing releases and @lmr is the only person able to do this?

pevogam commented 1 year ago

Alright so I assume no release by the end of the week then :blush: Sorry, on my side the reason to wanted to move on with this was that we have had unmerged branches that are waiting for an aexpect release (I actually thought it will happen much quicker) for 2+ weeks now and I am taking a one-week absence next week, meaning that we will have to endure with these patches for even longer than originally anticipated. But I guess it is simply not possible to do this on your side, just wanted to explain why I was like this on my side here.

ldoktor commented 1 year ago

Hello @pevogam, I pinged @lmr and he promised to release it today or tomorrow (he's doing this in his free time, it's not his current company active project...). I'm sorry about the delays, I know you wanted to release it earlier but I wanted to clarify things before we reach "production". Feel free to share the deadlines with us next time so we can better prioritize.

pevogam commented 1 year ago

Hi @ldoktor and thanks so much for the feedback! I can also do some work tomorrow (on Saturday) but I am at a very distant time zone than the European one so let's see if there is any chance it happens by any time tonight.