External Computing - Githubissues

moriarty commented 7 years ago

This issue relates to #128 and #173, which have been closed and answered. The rules in Section 3.6 of the RuleBook are clear enough for now, but they could use some further clarification. One reason I say "for now" is that I haven't seen any teams take advantage of them. Has someone been keeping track?

I think it's a great opportunity, and while competing last year our team considered it: not an external cloud-based option, but our own developer box on the network.

I also agree with the point someone brought up in the previous issue about fairness. Especially in the new Standard Platform leagues. The variety of manipulators seen in the OPL

The main issue was that Section 3.6, item 5 ("Availability") would require us to make this publicly available, for use by robots of other teams, well before and after the competition.

It's a fine rule, and it would be great to see, but it's just hard to implement on the team side. Last year, I did spend some time looking into it, but in the end we upgraded one of our onboard computers and didn't think it was worth it...

Considerations we had:

Onsite internet connection has been bad in previous competitions. In two RoboCup competitions, I have returned to the hotel to download some packages... To get around this, we travel with our own Linux, ROS and git mirrors. But for this reason we only seriously considered having an extra machine in the arena, connected to the network, not renting or using the university resources.
Cloud computing resources are expensive, and most RoboCup budgets are stretched... Most universities have some access to compute clusters; however, it is often impossible to give non-registered users access to them. They're usually shared resources, and it is hard to schedule on-demand jobs or switch tasks. Having these resources available means there is little incentive to spend our budgets on publicly available resources like AWS. Even if we could have gotten access to our... So what does "publicly available for use by the other teams" mean?
If it needs to be available to other teams, and it's a custom solution, do we then also need to support the other teams? What if it's not stable?
API vs. computing resources: running a service continuously and publicly and making the API available, vs. leaving a machine available where any other team can spin up a node (Docker, OpenStack, VMware, etc., but that requires restricting the teams during test runs). And how do we limit which team accesses it during a test in the case of a shared resource?
Limit of 5 "external computing resources": computing resources in this sense can be a little fuzzy... For example, my home network has multiple machines, physically more than the 5-machine limit, but to me they act as one, so I could claim they are one resource. One of those machines has a Xeon with multiple network connections, and it runs (at least) 5 virtual machines, which appear to the network as if they were physical, but they spin up on demand. Is that one resource or 5? ... And that's just my home network. It gets fuzzier when you take into account the mix of hardware available to rent in the cloud, or that a university research grant could purchase. Is an Nvidia DGX-1 one resource? I agree there should be a limit, but depending on how it's specified it may need to be adjusted.
If a team chooses to use a cloud-based compute resource, for example, rents some OpenStack compute nodes from @Swat30 at @CloudA [ disclosure: tagging a friend because I'm not an expert ] and has the ability to spin them up on demand just before the competition, what needs to be communicated to the other teams? I don't equate publicly available with free. Does a team need to say: "I'm using OpenStack, you can rent from any of x, y or z and here is how to spin up a node"? Or must they say: "I'm using OpenStack, here is my image+code, and here is how you spin up a node"? Or must the team continuously run those nodes and make them available?
It would be nice to have a mechanism to test the connectivity rule, that the robot should still function, like the e-stop is tested during qualifications.
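As a rough illustration of what such a connectivity check could exercise, here is a minimal Python sketch of a planner call that prefers an external machine but falls back to the onboard one when the network is down or too slow. The names plan_with_fallback, external_plan and onboard_plan are hypothetical and do not come from the RuleBook or from any team's code; the one-second timeout is an arbitrary assumption.

```python
import concurrent.futures as cf


def plan_with_fallback(request, external_plan, onboard_plan, timeout_s=1.0):
    """Try the (hypothetical) external planner; fall back to onboard on any failure.

    Returns a (plan, source) tuple so a referee or a log can see which path ran.
    """
    pool = cf.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(external_plan, request)
    try:
        # Raises on timeout or if the external call itself raised.
        return future.result(timeout=timeout_s), "external"
    except Exception:
        # Network down, timeout, remote error: the robot must still function.
        return onboard_plan(request), "onboard"
    finally:
        # Do not block on a hung external call.
        pool.shutdown(wait=False)
```

A qualification-style test could then unplug the external machine (or inject a failing external_plan) and check that the source reported is "onboard" and the task still completes.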

Also, I assume that external computing means during the competition. Training neural networks can be done before the competition tests and is the compute-expensive and time-consuming part; using that network later on the robot doesn't require so many resources. Is this considered External Computing?

I would like to see this, because as a team member in this league and another league I've experienced the pain of low compute power (I don't consider an Intel i7 and a Jetson to be low compute). Yet, at the same time, it's painful to see how the resources are immediately wasted once they're available, so it is nice to have the competition pushing solutions which do work with limited compute resources.

I think if the league wants to see this, the section could be modified to be more encouraging, yet with stricter specifications.

Most of the time, the teams have a ton of other items on their issue lists. It's already challenging enough to allocate enough team member hours to developing and testing a mobile robot with a standard wifi network in a new setting. Adding external compute resources into the mix adds another level of complexity and risk, which could result in not being able to show off the hard work and research that went into the rest of the system. So the balance of risk vs. reward tilts heavily toward risk, which can result in teams not trying external compute resources and instead opting to further fine-tune and overfit their solutions to the tasks.

Sorry, this became less of a small issue list and more of a rant.

LoyVanBeek commented 7 years ago

That's a big rant... :-)

In all honesty, there's a slight preference to have self-contained robots. But if you must, it is possible; we don't want to restrict anyone, though we do want to keep things fair. So far, the only requests I've seen for approval are for the use of publicly hosted APIs, unaffiliated with any team and also usable by other teams.

IIRC, there have been no requests to use private compute resources for use during a challenge, but when needed, we'll judge then.

If you want to train a neural net with data gathered during a challenge but do the training outside the challenge, you can use whatever you want. Of course you can use the trained network for inference on the robot during the challenge; that is not considered external computing. Sharing the neural net would be nice, but that is beside the point here.

How would you change the section to be more encouraging?

kyordhel commented 7 years ago

I am pasting here an answer I gave to [Undisclosed] from Toyota.

The rule means that software running in external computers must

A) be open source (BSD/GPL/etc.)
B) have its product details (e.g. vendor, patent number, etc.) and interfacing published for scientific use

It seems to me that A and B are mutually exclusive. Consequently, under normal circumstances teams can choose one or the other.

To clarify: although it makes sense for companies to keep the successful results of their investments secret, for the scientific community any solution that cannot at least be accessed for comparison is irrelevant and useless. For instance, some teams have used Microsoft SAPI for speech recognition. The API is closed and one can only guess how it works, but it is accessible for other teams to try and test, and therefore usable for scientific benchmarking.

We saw no need to specify this in more detail in the rulebook, since @Home is a scientific competition which has solution sharing among teams as part of its objectives.

Also, I would like to remark that the rulebook must be as compact as possible, and all extra information should be maintained in a Wiki (e.g. this repo wiki). Maybe @moriarty or any other OC member could help us with that.

moriarty commented 7 years ago

There were some replies to the OC mailing list which clarified things.

The rule means that software running in external computers must

A) be open source (BSD/GPL/etc.)
B) have its product details (e.g. vendor, patent number, etc.) and interfacing published for scientific use

[...]

In summary, we want scientific solutions, no magic boxes that "somehow" make the robot work.

The above reply helped clarify most of my questions.

Now, towards making it more encouraging, clarity on the meaning of public availability is likely enough.

When I was competing last year, we looked into external devices. Our robot doesn't have a GPU for some of the vision tasks, and it would have been nice to have a higher-power CPU for some of the task planning and motion planning. We thought an Nvidia TX-1, or something with Xeon D-1587-type specs, would be "good enough" and a reasonable price for a real "smart home".

However, when I read that it had to be publicly available, we thought that meant more effort than it was worth. I read that and thought it implied that if I installed my device in the arena, other teams would have access to it, I would need to support them, and if we wanted to test or prepare, we'd need to share a schedule with other teams to use our own device. The only way I saw to achieve that would be to use OpenStack/KVM/Docker/... etc.

We can run the task planning or motion planning on board, but running it on a higher core count CPU achieves better results, and likely safer or more optimal plans. But also, just for example, if the device is detected and the ping rate is high enough, then it can be used in parallel... We never got that far implementation-wise; we stopped at "how do we give the other teams public access to this external compute device?", for which we considered how to share the hardware.
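To make the "device is detected and the link is good enough" idea concrete, here is a minimal Python sketch under stated assumptions: it treats the check as a TCP connect-time probe rather than an ICMP ping, and the names probe_latency and should_offload, the 50 ms limit and the three attempts are all hypothetical choices, not something we implemented.

```python
import socket
import time


def probe_latency(host, port, timeout_s=0.2):
    """Return the TCP connect time in seconds, or None if the device is unreachable."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return time.monotonic() - start
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return None


def should_offload(host, port, max_latency_s=0.05, attempts=3):
    """Offload only if every probe succeeds and the worst latency is within the limit."""
    samples = [probe_latency(host, port) for _ in range(attempts)]
    if any(s is None for s in samples):
        return False
    return max(samples) <= max_latency_s
```

The robot would call should_offload before each planning request and simply stay on the onboard planner whenever it returns False, so losing the external device never blocks a task.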

But from the clarification, I now understand that a team could install a device, as long as it's running OSS, or use publicly available web APIs, or both, because up to 5 external compute devices could be used. So if a team installs a device and runs MoveIt! with publicly available planners, this is fine. Or if they install a device and run some OSS task planner, this is also fine. But if a lab is developing a task planner or a motion planning algorithm which they haven't yet released, then they need to make it OSS to run it on an external device, but they don't need to share that physical device with the other teams.

It might also be worth being explicit in stating what you mentioned in the last point.

If you want to train a neural net with data gathered during a challenge but do the training outside the challenge, you can use whatever you want. Of course you can use the trained network for inference on the robot during the challenge; that is not considered external computing.

Because I think that's the issue which is most relevant to the developers on the teams. The teams should already be collecting the data according to 3.4.

moriarty commented 7 years ago

@kyordhel I just saw your reply so I hit the comment button.

RoboCupAtHome / RuleBook

External Computing #243