Building-ML-Pipelines / building-machine-learning-pipelines

Code repository for the O'Reilly publication "Building Machine Learning Pipelines" by Hannes Hapke & Catherine Nelson
MIT License

How to read JSON files with FileBasedExampleGen #40

Closed by albertnanda 2 years ago

albertnanda commented 3 years ago

Please add code for reading JSON files.

hanneshapke commented 2 years ago

Hi @albertnanda,

You can reuse the code from the example and replace the following lines:

for image_filename in images:
    image_path = os.path.join(input_base_uri, image_filename)
    example = get_image_data(image_path)
    writer.write(example.SerializeToString())

with your custom code. For example, if you load a single JSON file and want to convert it to TFRecords, the code could look like this (untested):

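import json

# Load the whole JSON file; this assumes the top level is a list of records.
with open('data.json') as f:
    data = json.load(f)

# `writer` is assumed to be the TFRecord writer already opened in the original example.
for record in data:
    # Placeholder: replace with your own function that returns a tf.train.Example.
    example = your_function_to_map_json_data_to_example(record)
    writer.write(example.SerializeToString())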
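
For anyone landing here later, below is a rough sketch of what the placeholder mapping function might look like. It is not code from the book or the repo. The names `load_records`, `classify_value`, `record_to_example` and `convert_json_to_tfrecord` are made up for illustration, and it assumes flat JSON records whose values are numbers, booleans or strings (anything else is stored as a JSON-encoded string). TensorFlow is imported lazily so the pure-Python helpers can be tried without it.

```python
import json


def load_records(path):
    """Load a JSON file that holds either one object or a list of objects."""
    with open(path) as f:
        data = json.load(f)
    return data if isinstance(data, list) else [data]


def classify_value(value):
    """Return (kind, values), where kind names the tf.train.Feature list to use."""
    # bool is checked first because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "int64", [int(value)]
    if isinstance(value, int):
        return "int64", [value]
    if isinstance(value, float):
        return "float", [value]
    # Everything else (str, None, nested lists/dicts) is stored as UTF-8 bytes.
    text = value if isinstance(value, str) else json.dumps(value)
    return "bytes", [text.encode("utf-8")]


def record_to_example(record):
    """Map one flat JSON record (a dict) to a tf.train.Example."""
    import tensorflow as tf  # lazy import: only needed when building Examples

    features = {}
    for key, value in record.items():
        kind, values = classify_value(value)
        if kind == "int64":
            features[key] = tf.train.Feature(
                int64_list=tf.train.Int64List(value=values))
        elif kind == "float":
            features[key] = tf.train.Feature(
                float_list=tf.train.FloatList(value=values))
        else:
            features[key] = tf.train.Feature(
                bytes_list=tf.train.BytesList(value=values))
    return tf.train.Example(features=tf.train.Features(feature=features))


def convert_json_to_tfrecord(json_path, tfrecord_path):
    """Write every record of a JSON file to a single TFRecord file."""
    import tensorflow as tf

    with tf.io.TFRecordWriter(tfrecord_path) as writer:
        for record in load_records(json_path):
            writer.write(record_to_example(record).SerializeToString())
```

One caveat with this sketch: the feature type is inferred per value, so a field that is `1` in one record and `1.5` in another would end up with inconsistent types across examples. For real data it is probably safer to hard-code the type of each field in your own mapping function.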