serde-rs / serde

Serialization framework for Rust
https://serde.rs/
Apache License 2.0

Beginner friendly example with json file #1195

Open mpizenberg opened 6 years ago

mpizenberg commented 6 years ago

Hi, I'm a total beginner in Rust and I'm trying to use Serde for JSON manipulation. I was looking for a way to write a JSON decoder and decode a JSON file on my disk, but I did not find any example that includes the "read file" part. Maybe that's intentional, to separate concerns, but as a total beginner, having to switch context so often to put together a minimal example is a bit tiresome.

I hope this beginner feedback can be useful. In any case, thank you for your work on Serde!

ZelphirKaltstahl commented 6 years ago

@mpizenberg I understand this a little, since I once had to search for this as well. Here is some code that I used for a blog, which might be useful for you:

Reading in all files inside a directory:

use std::fs;
use std::fs::File;
use std::io::BufReader;
use std::io::prelude::*;
use std::path::PathBuf;

// Reads the whole file into a String; panics if the file cannot be opened or read.
pub fn read_file(filepath: &str) -> String {
    let file = File::open(filepath)
        .expect("could not open file");
    let mut buffered_reader = BufReader::new(file);
    let mut contents = String::new();
    // Fail loudly instead of silently returning an empty String on a read error.
    buffered_reader
        .read_to_string(&mut contents)
        .expect("could not read file");

    contents
}

pub fn read_post_from_file(filepath: &str) -> String {
    read_file(filepath)
}

// reads all blog posts in a directory
pub fn read_blog(directory_path: String) -> Vec<String> {
    let post_file_names: Vec<String> = get_all_blog_posts(directory_path.clone());
    let mut posts: Vec<String> = Vec::new();

    for file_name in post_file_names {
        let mut path = PathBuf::new();
        path.push(directory_path.clone());
        path.push(file_name);
        let unrendered_post: String = read_post_from_file(path.to_str().unwrap());
        // render_post is not shown in this snippet.
        let rendered_post: String = render_post(unrendered_post);
        posts.push(rendered_post);
    }
    posts
}

// https://stackoverflow.com/questions/31225745/iterate-over-stdfsreaddir-and-get-only-filenames-from-paths
fn get_all_blog_posts(directory_path: String) -> Vec<String> {
    let paths = fs::read_dir(directory_path).unwrap();
    paths.filter_map(|entry| {
        entry
            .ok()  // Converts from Result<T, E> to Option<T>.
            .and_then(  // Returns None if the option is None,
                        // otherwise calls f with the wrapped value and returns the result.
                | dir_entry | dir_entry  // a DirEntry
                    .path()  // becomes PathBuf
                    .file_name()  // Option<&OsStr>  // is either Some... or None...
                    .and_then(  // only if file_name is not a None...
                        | file_name | file_name
                            .to_str()  // Option<&str>
                            .map(| file_name_str | String::from(file_name_str))))
                            // only does something when it's not None
    }).collect::<Vec<String>>()
}

(I know, this function for reading all files from a directory looks insane … That's why I wrote all those comments for myself.)
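For comparison, the directory listing above can probably be written with a plain loop. This is only a sketch using the standard library: `list_file_names` is a hypothetical name, it returns an `io::Result` instead of unwrapping, and, like the original, it skips names that are not valid UTF-8.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Collect the names of all entries directly inside `dir`.
/// Errors from `read_dir` or from individual entries are returned to the caller.
fn list_file_names(dir: &Path) -> io::Result<Vec<String>> {
    let mut names = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        // `file_name()` yields an OsString; skip names that are not valid UTF-8.
        if let Ok(name) = entry.file_name().into_string() {
            names.push(name);
        }
    }
    // The order returned by `read_dir` is platform-dependent, so sort for stable output.
    names.sort();
    Ok(names)
}
```

A caller could then do `for name in list_file_names(Path::new("posts"))? { ... }`, where `"posts"` is just a placeholder directory.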

Hopefully I did not leave out anything critical and you can adapt this code to your own needs. It says "blog" in some names because in my case I was reading in markdown files. However, you should be able to read in JSON files the same way, which gives you a String for each file. Once you have set up the structs for your types with Serde, you should be able to use something along these lines:

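let file_content: String = read_file(file_path);
// from_str returns a Result, so the error case has to be handled (here: panic).
let my_obj: MyType = serde_json::from_str(&file_content)
    .expect("could not parse JSON");

dtolnay commented 6 years ago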

Hi! The documentation of std::fs::File gives an example of the "read file" part:

https://doc.rust-lang.org/std/fs/struct.File.html#examples

Read the contents of a file into a String:

use std::fs::File;
use std::io::prelude::*;

let mut file = File::open("foo.txt")?;
let mut contents = String::new();
file.read_to_string(&mut contents)?;
assert_eq!(contents, "Hello, world!");
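As a side note, the open-then-read steps can also be collapsed into `std::fs::read_to_string`, which has been in the standard library since Rust 1.26. The sketch below wraps it in a hypothetical `read_file` helper that adds the path to the error message; the helper name and the message format are assumptions, not part of any library.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Read a whole file into a String, attaching the path to any I/O error.
fn read_file(path: &Path) -> io::Result<String> {
    fs::read_to_string(path).map_err(|err| {
        // Keep the original error kind, but mention which file failed.
        io::Error::new(err.kind(), format!("{}: {}", path.display(), err))
    })
}
```

Returning `io::Result` leaves it to the caller to decide between `?`, `expect`, or a fallback.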

Putting it together could look like:

#[macro_use]
extern crate serde_derive;

extern crate serde;
extern crate serde_json;

use std::error::Error;
use std::fs::File;
use std::io::Read;

#[derive(Deserialize, Debug)]
struct Mpizenberg {
    x: String,
}

fn main() {
    try_main().unwrap();
}

fn try_main() -> Result<(), Box<dyn Error>> {
    // Read the input file to string.
    let mut file = File::open("input.json")?;
    let mut contents = String::new();
    file.read_to_string(&mut contents)?;

    // Deserialize and print Rust data structure.
    let data: Mpizenberg = serde_json::from_str(&contents)?;
    println!("{:#?}", data);

    Ok(())
}

mpizenberg commented 6 years ago

Thanks @ZelphirKaltstahl and @dtolnay! I actually found out how to do it ^^ but still felt that it was worth mentioning.

To add my snippet to yours, here it is:

#[macro_use]
extern crate serde_derive;

extern crate serde;
extern crate serde_json;

use std::fs::File;
use std::path::Path;

#[derive(Serialize, Deserialize)]
struct SomeDataType {}

fn main() {
    let json_file_path = Path::new("path/to/file.json");
    let json_file = File::open(json_file_path).expect("file not found");
    let deserialized_camera: SomeDataType =
        serde_json::from_reader(json_file).expect("error while reading json");
}

PS: Anyone "in charge" can close the issue. I don't want to clutter the issue board if this isn't a priority, but I don't know whether people might want to leave it open.

entrptaher commented 4 years ago

Sorry for bumping an old issue, but I was also looking for a beginner-friendly example for the same task.

I tried the following code to load a 100 MB JSON file, but it hung for around 60 seconds, whereas Python, Crystal, and Node.js take 1 second to do the same.

Cargo.toml

[package]
name = "json-data-comparison"
version = "0.1.0"
edition = "2018"

[dependencies]
serde_json = "1.0.49"
serde_derive = "1.0.105"
serde = "1.0.105"

main.rs

#[macro_use]
extern crate serde_derive;
extern crate serde;
extern crate serde_json;

use std::time::Instant;
use std::fs::File;
use std::path::Path;

#[derive(Serialize, Deserialize)]
struct SomeDataType {}

fn main() {
    let json_file_path = Path::new("src/sample.json");
    let json_file = File::open(json_file_path).expect("file not found");
    let start = Instant::now();
    let _deserialized_camera: SomeDataType = serde_json::from_reader(json_file).expect("error while reading json");
    let elapsed = start.elapsed();
    println!("Elapsed: {:.2?}", elapsed);
}

If I use read_to_string and from_str, it finishes in 1.25 s instead of 60 seconds.

#[macro_use]
extern crate serde_derive;
extern crate serde;
extern crate serde_json;

use std::time::Instant;
use std::fs::read_to_string; // use instead of std::fs::File
use std::path::Path;

#[derive(Serialize, Deserialize)]
struct SomeDataType {}

fn main() {
    let json_file_path = Path::new("src/sample.json");
    let json_file_str = read_to_string(json_file_path).expect("file not found");
    let start = Instant::now();
    // use instead of from_reader
    let _deserialized_camera: SomeDataType = serde_json::from_str(&json_file_str).expect("error while reading json");
    let elapsed = start.elapsed();
    println!("Elapsed: {:.2?}", elapsed);
}

dtolnay commented 4 years ago

@entrptaher please see https://docs.rs/serde_json/1/serde_json/fn.from_reader.html where it says to apply a BufReader to file i/o. As you saw, unbuffered file i/o (trying to read a file one byte at a time) is not good.

But read_to_string as shown in my comment above is going to be the fastest (or memmap if that is an option for you).
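To make the buffering point concrete, here is a small standard-library-only sketch. It does not use serde_json at all: `CountingReader`, `unbuffered_calls`, and `buffered_calls` are hypothetical names, and an in-memory `Cursor` stands in for a file. It counts how often the underlying reader is asked for data when the input is consumed one byte at a time, with and without a `BufReader` in between.

```rust
use std::io::{self, BufReader, Cursor, Read};

/// Hypothetical helper: wraps a reader and counts calls to `read`.
struct CountingReader<R> {
    inner: R,
    calls: usize,
}

impl<R: Read> Read for CountingReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.calls += 1;
        self.inner.read(buf)
    }
}

/// Consume the reader one byte at a time and return the number of bytes seen.
fn consume_bytewise<R: Read>(reader: R) -> usize {
    reader.bytes().count()
}

/// Number of `read` calls that reach the source without buffering.
fn unbuffered_calls(data: &[u8]) -> usize {
    let mut counter = CountingReader { inner: Cursor::new(data), calls: 0 };
    consume_bytewise(&mut counter);
    counter.calls
}

/// Number of `read` calls that reach the source behind a `BufReader`.
fn buffered_calls(data: &[u8]) -> usize {
    let mut counter = CountingReader { inner: Cursor::new(data), calls: 0 };
    consume_bytewise(BufReader::new(&mut counter));
    counter.calls
}
```

Without the `BufReader`, the source is hit roughly once per byte; with it, the source is hit once per buffer refill. With a real `File`, each of those calls is a system call, which is the likely reason for the slowdown reported above.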

dtolnay commented 4 years ago

Locking because I don't want this to turn into a support issue.