Why read in increments of 16 bytes when looking for next good packet header?

Rup0rt / pcapfix

repair corrupted pcap files

http://f00l.de/pcapfix/

GNU General Public License v3.0


Why read in increments of 16 bytes when looking for next good packet header? #14

assafmo closed this issue 5 years ago

assafmo commented 5 years ago

https://github.com/Rup0rt/pcapfix/blob/20765dc4412f78633335169965000ecfcbb543ba/pcap.c#L506

Hi @Rup0rt, I'd love to know why you chose to read in increments of 16 bytes, aligned with the last good packet, instead of in increments of 1 byte. Is there some smart logic behind this decision? It seems that this method can miss good packets.

assafmo commented 5 years ago

Also, I'd love to know why you chose to mark packets with a time difference of more than ±24 hours as corrupted.
I get the logic, but why not an hour? Why not 10 minutes?

Thanks. :tada:

Rup0rt commented 5 years ago

> Hi @Rup0rt, I'd love to know why you chose to read in increments of 16 bytes, aligned with the last good packet, instead of in increments of 1 byte. Is there some smart logic behind this decision? It seems that this method can miss good packets.

The scan starts at pos+16+1 because pos is the position right after the last proper packet header (16 bytes) was found. (I call this behavior overlapping detection.) Pcapfix tries to find a valid next packet inside the data of the previous one. Adding 16 bytes prevents scanning inside the last proper packet's header, which has already been verified as correct.
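The overlapping detection described above can be sketched roughly as follows. This is a minimal illustration, not pcapfix's actual code: the function names, the `snaplen` parameter, and the plausibility rules in `looks_like_header` are all hypothetical, and the sketch assumes that `pos` is the buffer offset of the last good header and that the file uses the host's byte order. The only fixed fact it relies on is the 16-byte pcap record header (ts_sec, ts_usec, incl_len, orig_len, 4 bytes each).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PCAP_REC_HDR_LEN 16 /* ts_sec, ts_usec, incl_len, orig_len */

/* Hypothetical plausibility check; pcapfix's real validation differs. */
static int looks_like_header(const uint8_t *p, uint32_t snaplen)
{
    uint32_t ts_usec, incl_len, orig_len;

    /* memcpy avoids unaligned reads; host byte order is assumed. */
    memcpy(&ts_usec, p + 4, 4);
    memcpy(&incl_len, p + 8, 4);
    memcpy(&orig_len, p + 12, 4);

    return ts_usec < 1000000u          /* microseconds stay below one second */
        && incl_len > 0
        && incl_len <= snaplen         /* captured length within snap length */
        && incl_len <= orig_len;       /* cannot capture more than was on the wire */
}

/*
 * Scan byte by byte inside the data of the packet whose header sits at
 * `pos` (assumption), starting at pos+16+1 so that the already verified
 * header is skipped. Returns the offset of the first plausible header,
 * or -1 if none is found.
 */
long find_overlapping_header(const uint8_t *buf, size_t len, size_t pos,
                             uint32_t incl_len, uint32_t snaplen)
{
    assert(buf != NULL);

    size_t start = pos + PCAP_REC_HDR_LEN + 1;
    size_t end = pos + PCAP_REC_HDR_LEN + incl_len; /* end of packet data */

    for (size_t cand = start;
         cand < end && cand + PCAP_REC_HDR_LEN <= len;
         cand++) {
        if (looks_like_header(buf + cand, snaplen))
            return (long)cand;
    }
    return -1;
}
```

Note that the candidate offset advances by one byte per iteration; the `+16` only moves the starting point past the verified header.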

I hope this explanation was understandable? :) If you encounter a problem with this loop, please let me know and maybe provide a sample file so that I can fix it and extend my test cases. Thanks!

Rup0rt commented 5 years ago

> Also, I'd love to know why you chose to mark packets with a time difference of more than ±24 hours as corrupted. I get the logic, but why not an hour? Why not 10 minutes?

I tried to find a threshold that catches broken packets but still allows a realistic time span between two valid ones. If you capture network traffic, there is a real chance that an interface sees no data at all for 10 minutes or even 1 hour (depending on the use case of the connection). So I tried to find a balance between detecting broken packets and tolerating possible idle times.

I developed all cases from real pcap samples. If you think the time span is not well chosen, please let me know. Maybe I could add an option parameter to set the max time span manually. Thanks!
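As a rough illustration of the check being discussed, with the maximum time span passed in as a parameter: the function name, its signature, and the `MAX_TS_SPAN_SEC` constant are hypothetical and only mirror the ±24 hour rule mentioned in this thread, not pcapfix's actual implementation or any existing command-line option.

```c
#include <stdint.h>

/* Default span from this thread: 24 hours, expressed in seconds. */
#define MAX_TS_SPAN_SEC (24u * 60u * 60u)

/*
 * Hypothetical helper: returns 1 if the candidate timestamp lies within
 * max_span_sec of the last good timestamp (in either direction), else 0.
 */
int timestamp_plausible(uint32_t last_good_sec, uint32_t candidate_sec,
                        uint32_t max_span_sec)
{
    /* Absolute difference without risking unsigned underflow. */
    uint32_t diff = candidate_sec > last_good_sec
                        ? candidate_sec - last_good_sec
                        : last_good_sec - candidate_sec;

    return diff <= max_span_sec;
}
```

With the 24 hour default, a one hour gap on a quiet interface is accepted; calling the same helper with a span of 600 seconds would reject that packet, which is the trade-off described above.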

assafmo commented 5 years ago

Thank you for answering ☺