maxim5 / hyper-engine

Python library for Bayesian hyper-parameter optimization
https://pypi.python.org/pypi/hyperengine
Apache License 2.0
86 stars · 22 forks

How to feed our own data? #9

Open ymcasky opened 6 years ago

ymcasky commented 6 years ago

How can I feed my own data instead of using mnist? Like the example in this post https://github.com/aymericdamien/TensorFlow-Examples/blob/master/notebooks/5_DataManagement/build_an_image_dataset.ipynb Thanks for any help!

maxim5 commented 6 years ago

Hi @ymcasky

Here is an example of a custom data provider: it's a simple interface, and you basically need to implement the next_batch method. Note, however, that the interface currently works with numpy arrays; the tensorflow dataset API is not supported yet.
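For readers landing here later, a minimal sketch of what such a provider can look like. Only next_batch is taken from the discussion above; the class name and constructor are illustrative, not hyperengine's actual API:

```python
import numpy as np

class ArrayProvider:
    """Illustrative batch provider over in-memory numpy arrays.

    Exposes next_batch(batch_size) -> (x_batch, y_batch), the one
    method the discussion above says a custom provider must implement.
    """
    def __init__(self, x, y):
        self.x = np.asarray(x)
        self.y = np.asarray(y)
        self._cursor = 0

    def next_batch(self, batch_size):
        # Reshuffle and wrap around once the epoch is exhausted.
        if self._cursor + batch_size > len(self.x):
            perm = np.random.permutation(len(self.x))
            self.x, self.y = self.x[perm], self.y[perm]
            self._cursor = 0
        start, end = self._cursor, self._cursor + batch_size
        self._cursor = end
        return self.x[start:end], self.y[start:end]
```

Usage would then be `x_batch, y_batch = provider.next_batch(32)` inside the training loop.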

ymcasky commented 6 years ago

Dear @maxim5

Thanks for your reply! I have 2 questions.

  1. The example you provided loads the whole dataset into a numpy array and then implements next_batch. What if my memory can't hold the whole dataset?

  2. Keras has the API flow_from_directory, used like this:

train_datagen = ImageDataGenerator(horizontal_flip=True)
train_generator = train_datagen.flow_from_directory(directory=Imgpath,
                                                    batch_size=Batch_SIZE,
                                                    shuffle=True,
                                                    target_size=(img_H, img_W))
(x_batch, y_batch) = train_generator.next()

It is similar to your example but uses .next() instead of .next_batch(). Can I use this API with your tool? Thanks for your help!

maxim5 commented 6 years ago

Hi @ymcasky ,

  1. Since you only need to provide next_batch, you can load a new numpy array for each batch without holding the whole training set in memory. I'll make an example for this case.
  2. As far as I can see from the source code, it produces numpy arrays on each iteration, so yes, it should be compatible. Let me know the result if you try it.
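A sketch of both points, under stated assumptions: LazyFileProvider and GeneratorProvider are illustrative names, not part of hyperengine's API, and the only contract taken from the thread is that next_batch must return numpy arrays:

```python
import numpy as np

class LazyFileProvider:
    """Point 1: keep only file paths in memory and load the arrays
    for the current batch on demand. `loader` is any callable mapping
    a path to a numpy array (e.g. np.load, or a PIL-based reader)."""
    def __init__(self, paths, labels, loader=np.load):
        self.paths = list(paths)
        self.labels = np.asarray(labels)
        self.loader = loader
        self._cursor = 0

    def next_batch(self, batch_size):
        # Reshuffle paths and labels together at epoch boundaries.
        if self._cursor + batch_size > len(self.paths):
            perm = np.random.permutation(len(self.paths))
            self.paths = [self.paths[i] for i in perm]
            self.labels = self.labels[perm]
            self._cursor = 0
        start, end = self._cursor, self._cursor + batch_size
        self._cursor = end
        x = np.stack([self.loader(p) for p in self.paths[start:end]])
        return x, self.labels[start:end]

class GeneratorProvider:
    """Point 2: adapt a Keras-style iterator (anything supporting
    next() / __next__ and yielding (x, y) numpy pairs, as
    flow_from_directory's iterator does) to next_batch. The wrapped
    generator fixes the real batch size; batch_size is accepted
    here only to match the next_batch signature."""
    def __init__(self, generator):
        self._gen = generator

    def next_batch(self, batch_size):
        x_batch, y_batch = next(self._gen)
        return np.asarray(x_batch), np.asarray(y_batch)
```

With these, `GeneratorProvider(train_generator)` would wrap the flow_from_directory iterator from the question above.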

ymcasky commented 6 years ago

ok, thank you!