alexcrichton / tar-rs

Tar file reading/writing for Rust
https://docs.rs/tar
Apache License 2.0
626 stars, 185 forks

Write pax headers in Builder #102

Open dswd opened 7 years ago

dswd commented 7 years ago

It would be nice if the builder could also create tar files with some pax headers set. If that option already exists, I have not found it.

alexcrichton commented 7 years ago

Ah yeah, currently this isn't explicitly supported; you'd have to build up each Entry manually to insert them. Would love to support it though!

dswd commented 7 years ago

How would I build such an entry with pax headers? Do I need to encode them in a special way and pass the resulting bytes to the Builder?

alexcrichton commented 7 years ago

Currently you'd have to create the Header instances manually and then use the append API to append the headers one by one. Not exactly the greatest API :(

dswd commented 7 years ago

Ok, I found some documentation on how to encode pax headers. So basically I have to:

1) create a file entry with the type set to XHeader
2) set the file data to the pax headers, each encoded as <length> <key>=<value>\n, where length is the total length of the encoded pax record in bytes, including the newline and the length field itself, written in decimal (Who designs such crap?)

Is that correct?
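
For illustration, here is a minimal std-only sketch of that record encoding. The function name `pax_record` is hypothetical and not part of tar-rs; it finds the length by re-counting the length's own digits until the value stops changing.

```rust
/// Encode one pax record as `<length> <key>=<value>\n`.
/// `pax_record` is a hypothetical helper, not part of the tar crate.
fn pax_record(key: &str, value: &str) -> Vec<u8> {
    // Everything except the length digits: ' ' + key + '=' + value + '\n'.
    let rest = key.len() + value.len() + 3;
    // Start by assuming a one-digit length, then grow the guess until the
    // number of digits in `len` matches the digits that were budgeted for.
    let mut len = rest + 1;
    while rest + len.to_string().len() != len {
        len = rest + len.to_string().len();
    }
    let mut record = format!("{} {}=", len, key).into_bytes();
    record.extend_from_slice(value.as_bytes());
    record.push(b'\n');
    record
}
```

For example, `pax_record("mtime", "1321711775.972059463")` produces the 30-byte record `30 mtime=1321711775.972059463\n`.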

alexcrichton commented 7 years ago

Sounds about right to me! You can take a look at the code which parses pax headers in this repo as well for a reference.

Note that I'd be more than willing to accept a PR to add support for encoding pax headers!
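
As a rough illustration of the decoding direction, the sketch below splits pax data back into key/value pairs. It is a standalone std-only example, not the parser from this repo, and `parse_pax_records` is a hypothetical name. Each record is read by taking the decimal length up to the first space and then cutting the body at the first `=`.

```rust
/// Split pax extended-header data into `(key, value)` byte-slice pairs.
/// Returns `None` if any record is malformed.
/// `parse_pax_records` is a hypothetical helper, not part of the tar crate.
fn parse_pax_records(mut data: &[u8]) -> Option<Vec<(&[u8], &[u8])>> {
    let mut records = Vec::new();
    while !data.is_empty() {
        // The decimal length runs up to the first space.
        let space = data.iter().position(|&b| b == b' ')?;
        let len: usize = std::str::from_utf8(&data[..space]).ok()?.parse().ok()?;
        // The length covers the digits, the space, the body and the newline.
        if len <= space + 1 || len > data.len() || data[len - 1] != b'\n' {
            return None;
        }
        let body = &data[space + 1..len - 1];
        // Keys cannot contain '=', so the first '=' separates key and value.
        let eq = body.iter().position(|&b| b == b'=')?;
        records.push((&body[..eq], &body[eq + 1..]));
        data = &data[len..];
    }
    Some(records)
}
```

Returning borrowed slices keeps the sketch allocation-free apart from the `Vec`. A real parser would probably also want to report which record was malformed rather than just returning `None`.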

dswd commented 7 years ago

Hi, not really a PR (sorry, my time is extremely limited currently) but this is what I am working with right now:

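use std::io::{self, Write};

/// Accumulates encoded pax records, ready to be used as the data of an
/// XHeader entry.
struct PaxBuilder(Vec<u8>);

impl PaxBuilder {
    pub fn new() -> Self {
        PaxBuilder(Vec::new())
    }

    /// Appends one record encoded as `<length> <key>=<value>\n`.
    pub fn add(&mut self, key: &str, value: &str) {
        // Number of decimal digits in the length field, and the smallest
        // number that no longer fits in that many digits.
        let mut len_len = 1;
        let mut max_len = 10;
        // Space, '=' and newline, plus the key and value themselves.
        let rest_len = 3 + key.len() + value.len();
        // The length counts its own digits, so grow the digit count until
        // the total fits.
        while rest_len + len_len >= max_len {
            len_len += 1;
            max_len *= 10;
        }
        let len = rest_len + len_len;
        // Writing to a Vec<u8> cannot fail.
        write!(&mut self.0, "{} {}={}\n", len, key, value).unwrap();
    }

    fn as_bytes(&self) -> &[u8] {
        &self.0
    }
}

trait BuilderExt {
    fn append_pax_extensions(&mut self, headers: &PaxBuilder) -> Result<(), io::Error>;
}

impl<T: Write> BuilderExt for tar::Builder<T> {
    fn append_pax_extensions(&mut self, headers: &PaxBuilder) -> Result<(), io::Error> {
        let mut header = tar::Header::new_ustar();
        header.set_size(headers.as_bytes().len() as u64);
        header.set_entry_type(tar::EntryType::XHeader);
        // Compute the checksum last, after all other fields are set.
        header.set_cksum();
        self.append(&header, headers.as_bytes())
    }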
}

alexcrichton commented 7 years ago

Nice! That looks pretty reasonable to me.

SoniEx2 commented 6 years ago

Hmm...

Why not make pax headers include a whole tar file?

Pax headers are basically just "fat" headers, so you can simply make it so

OnePaxHeader = TarHeader PaxMetadata TarHeader

and when you write a pax header it writes two tar headers but it feels seamless with no need to special case anything.

(assuming I understand pax correctly)

y'know, abstract away until it's abstracted away completely.
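
To make that layout concrete, here is a small sketch of how much space such a "fat" entry would occupy in the archive, assuming the usual 512-byte tar blocks with data padded up to a block boundary. `padded` and `fat_entry_size` are hypothetical helpers for illustration only, not tar-rs API.

```rust
/// Size of one tar block in bytes.
const BLOCK: u64 = 512;

/// Round `n` up to the next multiple of the tar block size.
fn padded(n: u64) -> u64 {
    (n + BLOCK - 1) / BLOCK * BLOCK
}

/// Bytes occupied by one "fat" entry: the extended header block, the padded
/// pax records, the regular header block, then the padded file data.
fn fat_entry_size(pax_len: u64, file_len: u64) -> u64 {
    BLOCK + padded(pax_len) + BLOCK + padded(file_len)
}
```

So a 30-byte pax record in front of an empty file costs three blocks (1536 bytes): two headers plus one block of pax data.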

erikh commented 4 years ago

Hi, I riffed on this and hopefully came up with something you can import easily, but no real tests to speak of: https://gist.github.com/erikh/c0a5aa9fde317ec9589271e78c78783c

Hope it's useful to someone.

gwitrand-ovh commented 2 weeks ago

Hey, needed this feature so I made a very simple version https://github.com/alexcrichton/tar-rs/pull/382