Open codificat opened 2 years ago
/kind key-result
We have discussed an approach that would reuse logic for template projects. Let's sync if we want to develop and maintain this type of logic in Kebechet.
A "bootstrap" command for the bot causes the repo to be pre-populated with the chosen stack content
I suggest making this the first milestone. It's ok to assume the user knows which stacks are available. So that
does that make sense?
/sig user-experience
A "bootstrap" command for the bot causes the repo to be pre-populated with the chosen stack content
I suggest making this the first milestone.
There is a prerequisite to this, which is that the stack content is readily usable. This fits with Frido's comment about template projects.
So that
- I start with an empty repo and install the bot.
- I open an issue and ask the bot to create a PR for e.g. the Image Recognition Stack.
- I can merge the PR and have a configured repo.
These 3 steps can potentially be reduced to one by just using the template project logic.
As this is a pre-requisite anyway, I suggest we make that the first milestone. The bot automation can be a follow-up.
I updated the description a bit to reflect that, with the template logic being "phase 1".
Makes sense?
Not sure how you plan to implement this, but sounds like it would require the addition of an ever-growing set of template repos (is that right?). Have you considered using a cookie-cutter [1] like repo that would serve as a single repo, with dynamic options that can be implemented on new repo creations based on the user's specific needs?
You can also look at our DS cookie-cutter repo [2] for a very simple example.
[1] https://github.com/cookiecutter/cookiecutter
[2] https://github.com/aicoe-aiops/cookiecutter-data-science
Not sure how you plan to implement this, but sounds like it would require the addition of an ever-growing set of template repos (is that right?).
Correct, that was the idea: start initially with the 3 "predictable stacks" we have been working on, but eventually offer more options.
Have you considered using a cookie-cutter [1] like repo
I saw cookie-cutter and the work you are doing with it, and I was planning to look closer at it, starting with one of the repos (see mention of cookie-cutter/your template in https://github.com/thoth-station/ps-nlp/issues/154 as an option to explore).
But I was still thinking of separate repos. Honestly, this did not occur to me:
that would serve as a single repo, with dynamic options that can be implemented on new repo creations based on the user's specific needs?
Thanks for the suggestion! I will look closer.
An initial question that comes to mind, though: wouldn't that single repo become too big/complex? e.g. the NLP stack alone already has 4 overlays. One goal of this functionality is to be simple, easy to understand - it is meant to bootstrap/get started, and I am wondering if we would potentially be over-complicating the starting point.
wouldn't that single repo become too big/complex? e.g. the NLP stack alone already has 4 overlays.
It's certainly a trade-off to consider: managing 1 complex repo vs. the complexity of managing multiple simple repos. Again, it depends on how you plan to implement this. I was just presenting a possible suggestion/alternative.
Would it be as or more complex than https://github.com/operate-first/apps ?
It's certainly a trade-off to consider: managing 1 complex repo vs. the complexity of managing multiple simple repos. Again, it depends on how you plan to implement this. I was just presenting a possible suggestion/alternative.
Yes, and thanks again for the suggestion, it is being considered.
Would it be as or more complex than https://github.com/operate-first/apps ?
It would not be as complex as that one, no.
/milestone OKR review Q2 2022
/triage accepted
/lifecycle active
/assign
/remove-lifecycle active
as focus this quarter is in a different KR
Problem statement
As a Data Scientist, I want a service that provides me an easy mechanism to bootstrap a new Data Science project, starting from a curated software stack that is appropriate for my project’s goals and available in a shared environment, so that I can quickly start working on the Data Science project tasks without having to invest time in preparing a working environment, and I can be confident that the project is reproducible and maintainable.
High-level Goals
When starting a new Data Science project from scratch, the user interacts with a git forge to obtain a git repository populated from a relevant curated software stack, with bots that keep it up to date with recommendations and make the git project readily available to start working on the Data Science tasks.
This involves:
Proposal description
Phase 1
As a Data Scientist, I want to be able to bootstrap a new GitHub repository from an existing template that contains a curated software stack that is relevant to my project.
Phase 1.5 is: automate phase 1 with a script.
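A minimal sketch of what such a script could look like, assuming the stack repositories are marked as template repositories on GitHub. It uses GitHub's REST endpoint for creating a repository from a template (`POST /repos/{template_owner}/{template_repo}/generate`). The function names and the split into "build" and "send" steps are illustrative choices, not an agreed design.

```python
import json
import urllib.request

GITHUB_API = "https://api.github.com"


def build_generate_request(template_owner, template_repo, new_owner, new_name,
                           token, private=False):
    """Build (but do not send) the request that creates a repo from a template."""
    url = f"{GITHUB_API}/repos/{template_owner}/{template_repo}/generate"
    payload = {"owner": new_owner, "name": new_name, "private": private}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def bootstrap(template_owner, template_repo, new_owner, new_name, token):
    """Send the request and return the web URL of the newly created repository."""
    request = build_generate_request(
        template_owner, template_repo, new_owner, new_name, token
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["html_url"]
```

Calling `bootstrap("thoth-station", "ps-nlp", "my-user", "my-nlp-project", token)` would then create `my-user/my-nlp-project`, assuming `ps-nlp` has been enabled as a template; the target owner and repository name here are placeholders.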
Phase 2
As a Data Scientist I want to open an Issue "please create an Image Processing notebook" on GitHub that triggers Thoth bot to start populating my repository:
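How the bot would map such an issue to a stack is not specified in this proposal. As a rough sketch under stated assumptions, the issue title could be matched against a small table of known stacks. The pattern, the `resolve_template` helper, and the `example-org/...` template names below are all hypothetical.

```python
import re

# Hypothetical mapping from stack names to template repositories.
STACK_TEMPLATES = {
    "image processing": "example-org/image-processing-template",
    "image recognition": "example-org/image-recognition-template",
    "nlp": "example-org/nlp-template",
}

# Matches titles such as "please create an Image Processing notebook".
REQUEST_PATTERN = re.compile(
    r"please create an? (?P<stack>.+?) (?:notebook|stack)\b",
    re.IGNORECASE,
)


def resolve_template(issue_title):
    """Return the template repository for the requested stack, or None."""
    match = REQUEST_PATTERN.search(issue_title)
    if match is None:
        return None
    stack = match.group("stack").strip().lower()
    return STACK_TEMPLATES.get(stack)
```

Returning `None` for unknown requests lets the bot reply with the list of available stacks instead of guessing, which fits the earlier assumption that the user knows which stacks exist.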
Alternatives
User manually doing each step
Additional context
Acceptance Criteria