apache / arrow-rs

Official Rust implementation of Apache Arrow
https://arrow.apache.org/
Apache License 2.0

LocalFileSystem errors with satisfiable range request #6749

Open kylebarron opened 3 days ago

kylebarron commented 3 days ago

Describe the bug

LocalFileSystem returns an error instead of data when a requested byte range starts before the end of the file but extends past it.

To Reproduce

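    // Imports assume the test lives outside the object_store crate, with
    // tokio, tempfile, and bytes available as dev-dependencies.
    use bytes::Bytes;
    use object_store::{local::LocalFileSystem, path::Path, ObjectStore};
    use tempfile::TempDir;

    #[tokio::test]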
    async fn range_request_beyond_end_of_file() {
        let root = TempDir::new().unwrap();
        let integration = LocalFileSystem::new_with_prefix(root.path()).unwrap();

        let location = Path::from("some_file");

        let data = Bytes::from("arbitrary data");

        integration
            .put(&location, data.clone().into())
            .await
            .unwrap();

        let read_data = integration.get_range(&location, 0..100).await.unwrap();
        assert_eq!(&*read_data, data);
    }

This currently fails with:

    called `Result::unwrap()` on an `Err` value: Generic { store: "LocalFileSystem", source: OutOfRange { path: "/private/var/folders/42/5jr6891d4ds4xysz7q0rsghw0000gn/T/.tmpK2eHCc/some_file", expected: 100, actual: 14 } }

Expected behavior

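In line with the HTTP spec and other object stores, LocalFileSystem should return the satisfiable part of the range request. For the reproduction above, that means returning the 14 bytes that exist for the range 0..100 rather than an OutOfRange error.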
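
To make the expected semantics concrete, here is a minimal sketch of the clamping I have in mind. `satisfiable_range` is a hypothetical helper written for illustration only. It is not an existing object_store function and not a proposed patch, and treating an empty range as unsatisfiable is my own assumption.

    ```rust
    use std::ops::Range;

    /// Hypothetical helper (not part of object_store): clamp a requested byte
    /// range to the bytes that actually exist in an object of length `len`.
    /// Returns `None` when no part of the range is satisfiable, i.e. when it
    /// starts at or past the end of the object, or is empty (an assumption).
    fn satisfiable_range(requested: Range<usize>, len: usize) -> Option<Range<usize>> {
        if requested.start >= len || requested.start >= requested.end {
            return None;
        }
        // Keep the start, but never read past the end of the object.
        Some(requested.start..requested.end.min(len))
    }
    ```

With the 14-byte file from the reproduction, a request for 0..100 would be served as 0..14, while a request starting at offset 14 or later would still be an error.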

Additional context

Some discussion in Discord: https://discord.com/channels/885562378132000778/885562378132000781/1308082632898248836