rdadolf / fathom

Reference workloads for modern deep learning methods.
Apache License 2.0

Create small debug dataset #4

Open rdadolf opened 8 years ago

rdadolf commented 8 years ago

We can't expect to distribute real datasets. First, they're hundreds of GB, so it's a logistical headache. Second, even if we could handle the size, we don't have license approval to redistribute them.

Two parts to a solution:

  1. Enable the use of large datasets when/if they're available.
  2. Distribute smaller datasets that we are allowed to give out. The performance numbers will be wrong, but the workloads should still run. As a bonus, it'll make debugging tools and writing modeling tests easier.
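
A minimal sketch of how the two parts could fit together: look for the full dataset first and fall back to the bundled debug copy. Everything named here is hypothetical and not part of Fathom: the function `resolve_dataset_dir`, the `FATHOM_DATA_ROOT` environment variable, and the `debug_data` directory are assumptions made for illustration.

```python
import os


def resolve_dataset_dir(name, full_root=None, debug_root="debug_data"):
    """Return (path, is_debug) for the dataset called `name`.

    All names here are hypothetical. The full dataset is used when a
    directory for it exists under `full_root`; otherwise the small
    debug dataset under `debug_root` is returned.
    """
    if full_root is None:
        # Hypothetical environment variable pointing at locally stored datasets.
        full_root = os.environ.get("FATHOM_DATA_ROOT")

    if full_root:
        candidate = os.path.join(full_root, name)
        if os.path.isdir(candidate):
            return candidate, False  # part 1: large dataset is available

    # Part 2: fall back to the small, redistributable debug dataset.
    return os.path.join(debug_root, name), True
```

Returning the `is_debug` flag lets a caller warn that any performance numbers come from the debug data and are not representative.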
rdadolf commented 8 years ago

Atari will be a problem, since it's highly unlikely that we'll be able to generate a test ROM. The best solution I can think of is probably to use a non-ALE debug input (i.e., a class which emulates the emulator API and pretends to be a game). Alternatively, we can simply not provide a debug input for Atari, but I'd rather keep that as a last resort.
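
One possible shape for such a debug input, as a rough sketch only. The class `DebugEmulator` is hypothetical. Its method names are loosely modeled on the ALE Python interface, but which calls the Atari workload actually needs is an assumption, as are the default 160x210 frame size and the fixed episode length. Frames are returned as raw `bytes` to keep the sketch dependency-free, which differs from a real emulator binding.

```python
import random


class DebugEmulator:
    """Hypothetical stand-in for an ALE-style emulator; needs no ROM.

    It produces random frames and rewards and ends every episode after
    a fixed number of steps, so the surrounding code can run end to end.
    """

    def __init__(self, width=160, height=210, episode_length=100, seed=0):
        self.width = width
        self.height = height
        self.episode_length = episode_length
        self._rng = random.Random(seed)  # seeded for reproducible debug runs
        self._steps = 0

    def getMinimalActionSet(self):
        # A made-up action set with four actions.
        return [0, 1, 2, 3]

    def reset_game(self):
        self._steps = 0

    def act(self, action):
        """Advance one step and return a fake reward (mostly zero)."""
        if action not in self.getMinimalActionSet():
            raise ValueError("unknown action: %r" % (action,))
        self._steps += 1
        return self._rng.choice([0, 0, 0, 1])

    def game_over(self):
        return self._steps >= self.episode_length

    def getScreenGrayscale(self):
        """Return one random frame as width * height bytes, row-major."""
        n = self.width * self.height
        return bytes(self._rng.randrange(256) for _ in range(n))


# Usage: drive the fake game until the episode ends.
emu = DebugEmulator(episode_length=5)
emu.reset_game()
total_reward = 0
while not emu.game_over():
    frame = emu.getScreenGrayscale()
    total_reward += emu.act(emu.getMinimalActionSet()[0])
```

Random frames carry no learnable signal, so this only exercises the data path and the model's compute; it says nothing about training quality.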