golemfactory / ray-on-golem

GNU General Public License v3.0

List of missing stuff #18

Closed mateuszsrebrny closed 1 year ago

mateuszsrebrny commented 1 year ago

This is a placeholder issue to list missing stuff without making a mess in the backlog.

### Tasks
- [ ] use `golem-registry` for resolving image tags into image hashes
- [ ] discover local python & ray version to choose proper image
- [ ] complete image matrix
- [ ] mid-agreement payments
- [ ] test on Windows & macOS
- [ ] support for ray 2.5.0
- [ ] check what happens when we rent a bigger provider than the yaml specifies (do we use more cores? do we pay for them?)
- [ ] yagna installed from pip
- [ ] noticing and reacting to failing providers
- [ ] use the socket proxy from golem-core (i.e. break the dependency on yapapi) to make sshing to nodes easier
- [ ] use the golem-core way of working with vpn capabilities (currently hardcoded as `"vpn"`)
- [ ] support other needed `ray up/down` api elements
- [ ] support other needed cluster yaml elements
- [ ] make sure pip install works from cluster.yaml
- [ ] tools to notice, build, verify & publish (registry, docs) images for new python & ray versions
- [ ] stats - measure ray on golem usage
- [ ] blacklisting providers that failed us
- [ ] choosing providers that were dependable
- [ ] keeping activities ready before the autoscaler demands new ones (or before nodes fail and need to be replaced)
- [ ] try alpine to make the image even smaller (~300 MB right now)
- [ ] `/tmp/golem` not needed / not created manually
- [ ] dashboard is enabled
- [ ] golem-ray logs to debug / send to us
- [ ] make ray use a proper directory instead of `~` for storing bootstrap config and ray apps (on local head node)
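
For the "discover local python & ray version" task above, a minimal sketch of what the discovery step might look like, using only the standard library. The `image_tag` function and its `py…-ray…` tag layout are hypothetical and only illustrate the idea; the actual image naming scheme and the `golem-registry` lookup are not described in this issue.

```python
import sys
from importlib import metadata
from typing import Optional


def local_python_version() -> str:
    """Return the running interpreter's version as 'major.minor.micro'."""
    return ".".join(str(part) for part in sys.version_info[:3])


def local_ray_version() -> Optional[str]:
    """Return the locally installed ray version, or None if ray is absent."""
    try:
        return metadata.version("ray")
    except metadata.PackageNotFoundError:
        return None


def image_tag(python_version: str, ray_version: str) -> str:
    # Hypothetical tag layout (assumption); the real scheme is not given here.
    return f"py{python_version}-ray{ray_version}"


if __name__ == "__main__":
    ray_version = local_ray_version()
    if ray_version is None:
        print("ray is not installed locally; cannot pick a matching image")
    else:
        print(image_tag(local_python_version(), ray_version))
```

The resulting tag would then still need to be resolved into an image hash, which is what the `golem-registry` task covers.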

mateuszsrebrny commented 1 year ago

cleaning issues (issue no longer needed)