red-data-tools / red-datasets

A RubyGem that provides common datasets
MIT License

support CIFAR-10, CIFAR-100 datasets #7

Closed hatappi closed 6 years ago

hatappi commented 6 years ago

add CIFAR-10, 100 datasets.

Usage

CIFAR-10

> require 'datasets'
> c = Datasets::Cifar.new
> c.metadata
# => #<struct Datasets::Metadata name="CIFAR-10", url="https://www.cs.toronto.edu/~kriz/cifar.html", licenses=nil, description="CIFAR-10 is 32x32 image datasets">
> c.each do |r|
    p r.data
    # => [59, 43, 50, 68, 98, 119, 139, 145, 149, 143, .....]
    p r.label
    # => 6
  end 

CIFAR-100

> require 'datasets'
> c = Datasets::Cifar.new(class_num: 100)
> c.metadata
# => #<struct Datasets::Metadata name="CIFAR-100", url="https://www.cs.toronto.edu/~kriz/cifar.html", licenses=nil, description="CIFAR-100 is 32x32 image datasets">
> c.each do |r|
    p r.data
    # => [71, 74, 75, 76, 78, 81, 81, 78, ....]
    p r.label
    # => 23
  end

hatappi commented 6 years ago

@kou Please review my pull request.

kou commented 6 years ago

Thanks. I've merged this and am working on some improvements. I'll write a summary here after the work.

kou commented 6 years ago

I'll do the following:

kou commented 6 years ago

It may be better to keep Record#data as a byte string instead of an array of integers, and to have Record#pixels return the array of integers, like the following:

class Record < Struct.new(:data, :label)
  def pixels
    data.unpack("C*")
  end
end
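
A minimal sketch of the idea above, assuming a made-up 4-byte string in place of real CIFAR data (the sample bytes and the variable names `raw` and `record` are illustrative only). The record keeps the raw bytes, and `pixels` decodes them on demand with `String#unpack`:

```ruby
# Sketch only: Record mirrors the struct proposed above; the sample bytes are made up.
Record = Struct.new(:data, :label) do
  # Decode the raw byte string into an array of unsigned 8-bit integers.
  def pixels
    data.unpack("C*")
  end
end

# Build a 4-byte binary string, standing in for the bytes read from a CIFAR file.
raw = [59, 43, 50, 68].pack("C*")
record = Record.new(raw, 6)

p record.data.bytesize # number of raw bytes held by the record
p record.pixels        # the same bytes as integers in 0..255
```

Keeping `data` as a string avoids allocating one Integer per pixel until a caller actually asks for `pixels`.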

kou commented 6 years ago

Record#to_gdk_pixbuf, Record#to_narray and so on may be useful.
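
As a rough illustration of what such converters would build on: CIFAR stores each image as three consecutive color planes (all red values, then all green, then all blue, each in row-major order). The sketch below regroups that planar array into rows of `[r, g, b]` triples. `planar_to_rgb` is a hypothetical helper, not part of red-datasets, and the 2x2 input is toy data:

```ruby
# Hypothetical helper (not a red-datasets API): regroup planar pixel values
# (all red, then all green, then all blue) into rows of [r, g, b] triples.
def planar_to_rgb(pixels, width: 32, height: 32)
  plane = width * height
  unless pixels.size == plane * 3
    raise ArgumentError, "expected #{plane * 3} values, got #{pixels.size}"
  end

  # Split into the three color planes, then pair up values by position.
  red, green, blue = pixels.each_slice(plane).to_a
  red.zip(green, blue).each_slice(width).to_a
end

# Toy 2x2 image: 4 red values, 4 green values, 4 blue values.
image = planar_to_rgb([1, 2, 3, 4, 10, 20, 30, 40, 100, 200, 150, 250],
                      width: 2, height: 2)
p image[0][0] # top-left pixel
p image[1][1] # bottom-right pixel
```

A real `Record#to_narray` or `Record#to_gdk_pixbuf` would depend on the respective libraries; this only shows the channel layout they would need to handle.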

kou commented 6 years ago

It may be better to use Datasets::CIFAR instead of Datasets::Cifar because the original dataset name is "CIFAR".

kou commented 6 years ago

e.g.: HTTP is better than Http for "HTTP".

kou commented 6 years ago

The usage in this pull request description is useful information, so we should put it in README.md or in a documentation comment in the source.

hatappi commented 6 years ago

@kou Thank you for the comments. I will do the following:

  1. Cifar => CIFAR
  2. Write README.md about CIFAR-10, 100
  3. Record#pixels
  4. set_type to type