rust-pcap / pcap

Rust language pcap library
Apache License 2.0
598 stars 138 forks source link

Implement iterator & Change to stream & Utils #229

Closed Stargateur closed 1 year ago

Stargateur commented 2 years ago

There is a lot of change here I will explain my reason:

aviramha commented 2 years ago

+1 for The stream implement use a codec to "map" Packet avoid lifetime issue, I changed this trait cause there is no reason a user of pcap to use our pcap Error. If the user want to return an Error it need to return a Result as Item. When I implemented the trait I was like.. wtf?

Wojtek242 commented 2 years ago

what is our official Rust minimal version ? (I suggest we jump to 2021 edition aka 1.56)

A good rule of thumb would be to support all versions of rustc that are still supported on Debian. These versions are:

In terms of how long these need to be supported:

Since Debian LTS is not an official Debian supported distro (https://wiki.debian.org/LTS) we could ignore it. But I would prefer not to ignore it without a very good reason as I like Debian a lot and I like the idea of LTS.

Edit: so in summary it's either 1.41 or 1.48 as the minimal supported version.

Edit2: note that there is no such requirement for features. capture-stream is already not locked to this requirement.

Wojtek242 commented 2 years ago

The stream implement use a codec to "map" Packet avoid lifetime issue, I changed this trait cause there is no reason a user of pcap to use our pcap Error. If the user want to return an Error it need to return a Result as Item

Good point. Agreed.

Wojtek242 commented 2 years ago

I moved public item cause stream was our only public module, (personally unless a crates is VERY BIG (like tokio, std...) all should be in root if possible)

I would actually prefer many smaller files for the code itself, but I agree about all of the functionality being available from just pcap::XXX. I was even intending on breaking lib.rs apart into smaller files, but keeping the import paths the same.

Stargateur commented 2 years ago

I would actually prefer many smaller files for the code itself, but I agree about all of the functionality being available from just pcap::XXX. I was even intending on breaking lib.rs apart into smaller files, but keeping the import paths the same.

yes I wanted to avoid add this to this PR, but I very much agree with need more file I added some in this PR ^^

Wojtek242 commented 2 years ago

I see the tests pass, is this PR ready to look at (it's marked as a draft ATM)

Stargateur commented 2 years ago

need to add more test but you can at least review the general idea

Wojtek242 commented 2 years ago

I've started looking at this PR and I got stuck on PacketOwned. It doesn't appear to me like a good design pattern to just copy the same definition, but make all fields owned rather than references. Wouldn't this be solved much better if we changed the fields of Packet to Cow pointers? It would also correctly represent the semantics of the struct itself whereby the memory where the data lies is not owned by the caller and if they want to keep the data they need to clone it.

Wojtek242 commented 2 years ago

But generally, the changes look good. Besides the question above, I only have minor things like typos etc. to point out, but I'll wait with them until the PR is fully finished.

Stargateur commented 2 years ago

Wouldn't this be solved much better if we changed the fields of Packet to Cow pointers? It would also correctly represent the semantics of the struct itself whereby the memory where the data lies is not owned by the caller and if they want to keep the data they need to clone it.

I don't think I can do better, CoW is mostly only useful for 'static lifetime. I needed to get rid of lifetime to create a Iterator, CoW would bring back this lifetime so this wouldn't compile, plus CoW have a small cost that some user don't need to have.

It doesn't appear to me like a good design pattern to just copy the same definition, but make all fields owned rather than references.

It's not that common, but having "owned" version is not new to Rust String => &str, Vec<T> => &[T] etc. true that more often seen in application than in library but this exist only to make it easy for our user to use Iterator and Stream. Only std can do fat pointer so I can't really do better.

I'll wait with them until the PR is fully finished.

Will work on it late sunday.

Wojtek242 commented 2 years ago

It's not that common, but having "owned" version is not new to Rust String => &str, Vec => &[T] etc. true that more often seen in application than in library but this exist only to make it easy for our user to use Iterator and Stream. Only std can do fat pointer so I can't really do better.

Yes, I did start by thinking of those two examples, but there is no way we can implement them. Hence, I suggested Cow. I'll think about it a but more. I haven't tried getting it to compile with your iterator though.

Will work on it late sunday.

No rush, don't push yourself :-)

Wojtek242 commented 2 years ago

don't think I can do better, CoW is mostly only useful for 'static lifetime. I needed to get rid of lifetime to create a Iterator, CoW would bring back this lifetime so this wouldn't compile, plus CoW have a small cost that some user don't need to have.

Yes, having thought about it, having that lifetime parameter on Cow is not very convenient since the data can be used in a form that outlives the source, but the lifetime parameter will be carried along.

Hmm, for now I agree, I also I don't see a better way

Wojtek242 commented 2 years ago

don't think I can do better, CoW is mostly only useful for 'static lifetime. I needed to get rid of lifetime to create a Iterator, CoW would bring back this lifetime so this wouldn't compile, plus CoW have a small cost that some user don't need to have.

Yes, having thought about it, having that lifetime parameter on Cow is not very convenient since the data can be used in a form that outlives the source, but the lifetime parameter will be carried along.

Hmm, for now I agree, I also I don't see a better way

I played around with the code on your branch, and Cow can be used provided that Packet's into_owned takes self (no reference) and returns Packet<'static> which would then be (logically) equivalent to your PacketOwned. If one interprets 'static as meaning "this struct contains no references with a lifetime shorter than 'static" rather than "this struct contains references with 'static lifetime" then this makes sense.

In terms of overhead, apart from some memory (as Cow is an enum), the only extra performance overhead should be when calling into_owned as that has to first check if the data is borrowed first. Since you only call that when you don't mind copying, that shouldn't be an issue.

I'm just thinking out loud here. You don't have to add this to your PR.

Stargateur commented 2 years ago

Finally I don't have any idea to add test, I'm think we should merge this. Sorry, for the delay.

Stargateur commented 1 year ago

Notable change:

Wojtek242 commented 1 year ago

Brilliant! Thanks!