simsong / tcpflow

TCP/IP packet demultiplexer. Download from:
http://downloads.digitalcorpora.org/downloads/tcpflow/
GNU General Public License v3.0

Timestamp format in report.xml drops 0 fraction #206

Open govcert-ch opened 5 years ago

govcert-ch commented 5 years ago

Hi,

This is not really a bug, but I don't think it's intended either. In report.xml, timestamps (startime/endtime) are usually rendered with six digits of fractional-second precision, e.g. 2019-01-18T17:54:49.637515Z. This can be parsed in Python with something like

ts = datetime.datetime.strptime(tsStrg, '%Y-%m-%dT%H:%M:%S.%fZ')

However, whenever the fraction is ".000000" (on average, about 1 in a million timestamps), it is dropped, as for example in this line (from a real pcap):

<tcpflow startime='2019-01-18T17:54:48Z' endtime='2019-01-18T17:54:49.637515Z' ...

The problem is that the strptime call above fails in this case with a ValueError. Because it happens so rarely, the problem can strike pretty late. The only workaround I found is something like

try:
  ts = datetime.datetime.strptime(tsStrg, '%Y-%m-%dT%H:%M:%S.%fZ')
except ValueError:
  ts = datetime.datetime.strptime(tsStrg, '%Y-%m-%dT%H:%M:%SZ')
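
A variant of the same workaround, as a minimal sketch: instead of catching the exception, pick the format string by checking whether the timestamp contains a fraction at all. The helper name parse_report_timestamp is made up for illustration and is not part of tcpflow; it assumes the only two shapes that occur are the ones shown above (with a ".ffffff" fraction, or with none).

```python
import datetime

def parse_report_timestamp(ts_str):
    """Parse a report.xml timestamp with or without a fractional part.

    Hypothetical helper (not part of tcpflow). Assumes the timestamp is
    either 'YYYY-MM-DDTHH:MM:SS.ffffffZ' or 'YYYY-MM-DDTHH:MM:SSZ'.
    """
    # A '.' only appears when the fractional seconds were written out.
    if '.' in ts_str:
        fmt = '%Y-%m-%dT%H:%M:%S.%fZ'
    else:
        fmt = '%Y-%m-%dT%H:%M:%SZ'
    return datetime.datetime.strptime(ts_str, fmt)
```

Both attribute values from the line above go through the same call, e.g. parse_report_timestamp('2019-01-18T17:54:48Z') and parse_report_timestamp('2019-01-18T17:54:49.637515Z'). Anything in an unexpected shape still raises ValueError, so malformed input is not silently accepted.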

I think the better solution would be for tcpflow not to suppress the zeros.

It could be fixed (if this is considered unwanted behaviour) in dfxml/src/dfxml_writer.h by removing lines 212 and 215, which would make appending the fraction independent of the (ts.tv_usec>0) condition.
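
To illustrate the difference, here is a rough Python model of the two behaviours. This is not the C++ code from dfxml_writer.h; the function names are invented for illustration, and the logic is only my reading of the condition described above.

```python
import datetime

def format_current(ts):
    """Model of the reported behaviour: fraction only when microseconds > 0."""
    s = ts.strftime('%Y-%m-%dT%H:%M:%S')
    if ts.microsecond > 0:  # stands in for the (ts.tv_usec>0) condition
        s += '.%06d' % ts.microsecond
    return s + 'Z'

def format_proposed(ts):
    """Model of the proposed behaviour: always append six fraction digits."""
    s = ts.strftime('%Y-%m-%dT%H:%M:%S')
    s += '.%06d' % ts.microsecond
    return s + 'Z'
```

With the proposed variant, every timestamp has the same shape, so a single strptime format string ('%Y-%m-%dT%H:%M:%S.%fZ') would parse all of them.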

simsong commented 5 years ago

If you wish to send a pull request, we will accept it!
