TotalWar-Modding / tw_pack_lib

1 stars 0 forks source link

Improve readability/maintainability/docs. #5

Open Frodo45127 opened 5 years ago

Frodo45127 commented 5 years ago

Right now docs are non-existant. I've started with them, but nothing is yet finished.

About readability/maintainability, I've a suggestion: Reduce abstractions (custom types/structs) to a minimum. Right now, while doing the comments for some functions I have to pass through 9 abstractions in two different crates (without counting hashmaps/arcs/mutex) to just know what data is in a PackedFile by looking at the code. Some of them are valid, like if it's data already loaded or to load, but others like making a struct for one thing are not, and just make the code harder to read. An example of it is this:

pub struct PackedFile {
    pub timestamp: Option<u32>,
    pub path: String,
    data: Mutex<PackedFileData>
}

pub(crate) struct PackedFileData {
    inner: PackedFileDataType
}

#[derive(Clone)]
pub(crate) enum PackedFileDataType {
    DataBacked(Arc<Vec<u8>>),
    LazyLoading(LazyLoadingPackedFile)
}

Why does that PackedFileData exists in the first place when you can do this:

pub struct PackedFile {
    pub timestamp: Option<u32>,
    pub path: String,
    data: Mutex<PackedFileData>
}

#[derive(Clone)]
pub(crate) enum PackedFileData {
    DataBacked(Arc<Vec<u8>>),
    LazyLoading(LazyLoadingPackedFile)
}

Another one: Keep the use/structs/enums/functions structured. For example, first uses, then enums/structs, then functions. I'm currently ordering the lib.rs file because having a type and an use in the middle of all the impls is not intuitive, at all.

Also, a bit offtopic, there is any way to talk to you more quickly, like a chat, discord,...? I've deliberately stopped doing big changes in RPFM so we can finish this first without extra compatibility issues, and it's taking too long, mainly because this is quite a slow way to communicate.

Trolldemorted commented 5 years ago

Why does that PackedFileData exists in the first place when you can do this:

Yeah that should work too if we can still replace it in get_data like we do now, but I guess it should be possible

Another one: Keep the use/structs/enums/functions structured. For example, first uses, then enums/structs, then functions. I'm currently ordering the lib.rs file because having a type and an use in the middle of all the impls is not intuitive, at all.

👍yeah a bit of ordering is definitely needed

Also, a bit offtopic, there is any way to talk to you more quickly, like a chat, discord,...? I've deliberately stopped doing big changes in RPFM so we can finish this first without extra compatibility issues, and it's taking too long, mainly because this is quite a slow way to communicate.

I have discord (because tca uses it for twa), but you can reach me faster and more conveniently on signal - just send me an email with what you prefer and I'll send you my account

Frodo45127 commented 5 years ago

I have discord (because tca uses it for twa), but you can reach me faster and more conveniently on signal - just send me an email with what you prefer and I'll send you my account

I sent you my discord nick to the mail in cargo.toml, just in case you don't recognize my mail address.

Back to the topic:

pub struct PackedFile {
    pub timestamp: Option<u32>,
    pub path: String,
    data: Mutex<PackedFileData>
}

Why does that data needs a Mutex? If neither the lib nor the programs that currently uses it (tw_unpack, and RPFM with the patch) uses that data in multiple threads, it's really needed?

Also, any reason why some of the functions do this?

    pub fn get_bitmask(&self) -> ::PFHFlags {
        parse::get_bitmask(&self.view)
    }

If we are just going to call a function and report their result unaltered... why not making the interior function (parse::get_bitmask(&self.view) for example) public directly?

EDIT: One more thing I forgot to ask. Is there a difference between Borrow<PackedFile> and &PackedFile? I´ve seen it in a couple of places and it seems to me they do the same thing.

Trolldemorted commented 5 years ago

I sent you my discord nick to the mail in cargo.toml, just in case you don't recognize my mail address.

I managed to send you a friend request I think.

Why does that data needs a Mutex? If neither the lib nor the programs that currently uses it (tw_unpack, and RPFM with the patch) uses that data in multiple threads, it's really needed?

Without the mutex you'd have to make get_data take &mut self instead of &self, because rust prevents you from altering data without a mutable reference since it cannot guarantee to prevent data races otherwise. RPFM with the patch should share the data between the bg and ui thread, though - b75bc4ff9233c58bf4218f2192710d665a111dfb should clone the arc which owns the data, not the data itself.

If we are just going to call a function and report their result unaltered... why not making the interior function (parse::get_bitmask(&self.view) for example) public directly?

That would be possible (and even result in less lines of code), but it would make the public api more complicated: Right now, from the users' perspective tw_pack_lib is one module that offers everything, if we change parse and build to public modules they will appear as such to the user. Getters like get_bitmask need a file view reference, so you'd have to make the view field public, and force users to pass it to get_bitmask. Also, you might want to keep the public api (impl PackFile) separated from the actual implementation (::parse).

EDIT: One more thing I forgot to ask. Is there a difference between Borrow and &PackedFile? I´ve seen it in a couple of places and it seems to me they do the same thing.

Borrow and AsRef are used to deal with borrowed and owned versions of the same type. So the following function

fn foo<P: Borrow<PackedFile>>(input: &Vec<P>)

may either take a Vec<PackedFile> or a Vec<&PackedFile>.

Frodo45127 commented 5 years ago

I managed to send you a friend request I think.

I´ll add you tomorrow. Too tired now.

RPFM with the patch should share the data between the bg and ui thread

Nop. In RPFM, the UI should only receive decoded data, not raw data. All the decoding/encoding happens in the background so the UI never sees the real data, only the data that has been prepared for the UI. The only exception is the DB Decoder, and there it just needs the first bytes, no the entire data, so a vector with those bytes is enough.

That would be possible (and even result in less lines of code), but it would make the public api more complicated: Right now, from the users' perspective tw_pack_lib is one module that offers everything, if we change parse and build to public modules they will appear as such to the user. Getters like get_bitmask need a file view reference, so you'd have to make the view field public, and force users to pass it to get_bitmask. Also, you might want to keep the public api (impl PackFile) separated from the actual implementation (::parse).

And why are separated to begin with? From my point of view, all those functions should really be inside impl PackFile, as they NEED a PackFile to begin with, or they use or return one. And I have the same problem with the PackFile that I had with PackedFileData: why is a PackFile a struct with just one thing inside from another crate? Can´t we just say PackFile = FileView and make a trait PackFile to implement all that stuff directly into FileView or something like that? That´ll eliminate all those middle functions and reduce the abstraction by another layer.

may either take a Vec or a Vec<&PackedFile>.

Thanks. Another thing I learnt today.

Trolldemorted commented 5 years ago

Nop. In RPFM, the UI should only receive decoded data, not raw data. All the decoding/encoding happens in the background so the UI never sees the real data, only the data that has been prepared for the UI. The only exception is the DB Decoder, and there it just needs the first bytes, no the entire data, so a vector with those bytes is enough.

Oh that is unfortunate. But the thread-safety comes for free, luckily.

And why are separated to begin with? From my point of view, all those functions should really be inside impl PackFile, as they NEED a PackFile to begin with, or they use or return one. And I have the same problem with the PackFile that I had with PackedFileData: why is a PackFile a struct with just one thing inside from another crate? Can´t we just say PackFile = FileView and make a trait PackFile to implement all that stuff directly into FileView or something like that? That´ll eliminate all those middle functions and reduce the abstraction by another layer.

Separating API from implementation makes sense because it helps you defining a decisive API without having specific implementation details influencing it. We could make a PackFile trait, but that would change the API since all FileView functions would be available too, which would violate the separation of concerns - tw_pack_lib should provide ways to manipulate pack files, and should not expose internals. If you drop the wrapper we would have to re-export our dependency, and users would have to import something from a crate they should not be forced to know about to move the PackFile into a variable (since you have to declare an explicit type), or use boxed types and dynamic dispatch or impl traits.