zentures / sequence

(Unmaintained) High performance sequential log analyzer and parser
http://sequencer.io

URIs starting with "//" are not tokenized correctly #15

Open Leftium opened 8 years ago

Leftium commented 8 years ago

Steps to Reproduce:

  1. echo "get //example.com" > input.txt
  2. go run sequence.go scan --input input.txt

Expected Results:

#   0: { Tag="funknown", Type="uri", Value="//example.com", ... }

Actual Results:

#   0: { Tag="funknown", Type="literal", Value="//example.com", ... }

Comments: I found this bug while processing an actual log file. One of the log events in question:

81.181.146.13 - - [15/Mar/2005:05:06:49 -0500] "GET //cgi-bin/awstats/awstats.pl?configdir=|%20id%20| HTTP/1.1" 404 1050 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"

A related question: what is the best way to handle relative URIs? Sequence's heuristic algorithm for processing URIs breaks down on these...
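
To make the relative-URI question concrete, here is a minimal sketch of one possible classification, based on the relative-reference forms in RFC 3986 section 4.2 (a network-path reference starts with "//", an absolute-path reference starts with a single "/"). This is not Sequence's scanner code: `uriKind`, `classifyURI`, and `isScheme` are hypothetical names made up for illustration, and the rules are deliberately simplistic (for example, a token starting with "//" could just as well be a comment marker in some logs).

```go
package main

import "strings"

// uriKind is a hypothetical classification used only in this sketch;
// it is not a type from the sequence package.
type uriKind int

const (
	notURI          uriKind = iota
	absoluteURI             // scheme://host/...
	networkPathRef          // //host/...  (RFC 3986 section 4.2)
	absolutePathRef         // /path/...   (RFC 3986 section 4.2)
)

// classifyURI applies a few prefix checks to a single whitespace-delimited
// token. It is an assumption about how such a heuristic could look, not a
// description of what sequence actually does.
func classifyURI(tok string) uriKind {
	switch {
	case strings.HasPrefix(tok, "//"):
		// Require at least one non-slash character after the "//".
		if len(tok) > 2 && tok[2] != '/' {
			return networkPathRef
		}
		return notURI
	case strings.HasPrefix(tok, "/"):
		if len(tok) > 1 {
			return absolutePathRef
		}
		return notURI
	}
	// Absolute form: a valid scheme, then "://", then something after it.
	if i := strings.Index(tok, "://"); i > 0 && isScheme(tok[:i]) && len(tok) > i+3 {
		return absoluteURI
	}
	return notURI
}

// isScheme checks the RFC 3986 scheme grammar:
// ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
func isScheme(s string) bool {
	for i, r := range s {
		alpha := (r >= 'a' && r <= 'z') || (r >= 'A' && r <= 'Z')
		if i == 0 {
			if !alpha {
				return false
			}
			continue
		}
		if !(alpha || (r >= '0' && r <= '9') || r == '+' || r == '-' || r == '.') {
			return false
		}
	}
	return s != ""
}
```

Called from a `main` function or a test, `classifyURI("//example.com")` yields `networkPathRef` under these rules, and the request target from the log event above is classified the same way. Whether a single leading "/" should count as a URI at all is a judgment call, since plain file paths look identical.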

leolee192 commented 4 years ago

After trying for weeks to contact the original author without success, I decided to migrate the project to leolee192/sequencer. Please visit leolee192/sequencer#10 for further activity, or to subscribe to notifications.