Error using data-conversion/scenario_conversion parsing text-format waymo.open_dataset.Scenario: 1:2: Interpreting non-ascii codepoint 192

I am encountering an error while using the Waymo Open Dataset conversion library to convert a Waymo scenario file to a TensorFlow Example and export it to a tfrecord file. When I build my code with bazel and run it, I get the following error:

Error parsing text-format waymo.open_dataset.Scenario: 1:2: Interpreting non-ascii codepoint 192

I suspect the issue is related to the encoding of the input file. Specifically, I am reading the scenario file in binary mode and then parsing its contents as text-format protobuf, so the file might not be in the format the parser expects. Any suggestions would be helpful.

Here's a code snippet that reproduces the error:


```cpp
#include <fstream>
#include <iostream>
#include <map>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

#include "waymo_open_dataset/data_conversion/scenario_conversion.h"
#include "waymo_open_dataset/protos/conversion_config.pb.h"

#include "absl/strings/str_cat.h"
#include "tensorflow/core/example/example.pb.h"
#include "tensorflow/core/lib/io/record_writer.h"
#include "tensorflow/core/platform/env.h"

#include "google/protobuf/text_format.h"
#include "google/protobuf/io/zero_copy_stream_impl.h"

int main() {
    // Path to the input scenario file.
    std::string scenario_file_path = "path/to/waymo_open_dataset_motion_v_1_2_0/training_20s/training_20s.tfrecord-00667-of-01000";
    waymo::open_dataset::Scenario scenario;

    // Open the scenario file in binary mode.
    std::ifstream input(scenario_file_path, std::ios::in | std::ios::binary);
    if (!input) {
        std::cerr << "Failed to open " << scenario_file_path << std::endl;
        return 1;
    }

    // Read the whole file into a string.
    std::stringstream buffer;
    buffer << input.rdbuf();
    std::string contents = buffer.str();

    // Parse the contents as text-format protobuf. This is the call that fails.
    if (!google::protobuf::TextFormat::ParseFromString(
        contents, &scenario)) {
        std::cerr << "Failed to parse " << scenario_file_path << std::endl;
        return 1;
    }

    // Default-constructed conversion config.
    waymo::open_dataset::MotionExampleConversionConfig config;

    // Convert the scenario to a TensorFlow Example.
    std::map<std::string, int> counters;

    absl::StatusOr<tensorflow::Example> status_or_example =
        waymo::open_dataset::ScenarioToExample(scenario, config, &counters);
    if (!status_or_example.ok()) {
        std::cerr << "Failed to convert scenario to Example: "
                << status_or_example.status().message() << std::endl;
        return 1;
    }
    tensorflow::Example example = status_or_example.value();

    // Create a new writable file for the output tfrecord.
    tensorflow::Env* env = tensorflow::Env::Default();
    std::unique_ptr<tensorflow::WritableFile> file;
    std::string file_name = "example.tfrecord";
    const tensorflow::Status open_status = env->NewWritableFile(file_name, &file);
    if (!open_status.ok()) {
        std::cerr << "Failed to create " << file_name << ": "
                << open_status.ToString() << std::endl;
        return 1;
    }

    // Create a record writer and write the serialized example to the file.
    tensorflow::io::RecordWriterOptions options = tensorflow::io::RecordWriterOptions::CreateRecordWriterOptions("");
    tensorflow::io::RecordWriter writer(file.get(), options);
    std::string example_string;
    example.SerializeToString(&example_string);
    const tensorflow::Status write_status = writer.WriteRecord(example_string);
    if (!write_status.ok()) {
        std::cerr << "Failed to write record: " << write_status.ToString() << std::endl;
        return 1;
    }

    // Close the file and print a success message.
    file->Close();
    std::cout << "Example exported to " << file_name << std::endl;
    return 0;
}
```

waymo-research / waymo-open-dataset, issue #653