Closed udi-aharon closed 1 year ago
Thank you @udi-aharon for the note. The traffic in the legit dataset is based on real-world traffic that, surprisingly, included the 'utf8' string. We acknowledge that it should be 'utf-8' (and apparently the app developers have since changed it as well; one explanation may be the use of certain dotnet libraries that were later updated). In any case, we will change it manually the next time we update the dataset with real-world traffic. Thanks again.
The following Content-Type headers appear in the Legitimate dataset; they are not valid and should be marked as Malicious:
text/plain;charset=utf8
application/x-www-form-urlencoded;charset=utf-8;
application/json; charset=utf8
Number of cases per source:
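For anyone who wants to scan the dataset for similar cases, here is a minimal Python sketch of a strict Content-Type check. It is only an illustration of the criteria discussed in this issue, not the dataset's actual labeling logic. The function name `is_strict_content_type` and the `ALLOWED_CHARSETS` allowlist are hypothetical. The sketch assumes that a trailing `;` with no parameter after it and the unhyphenated `utf8` label should both be rejected.

```python
import re

# Hypothetical allowlist of charset labels (an assumption for this sketch);
# "utf8" without the hyphen is deliberately absent.
ALLOWED_CHARSETS = {"utf-8", "us-ascii", "iso-8859-1"}

# HTTP "token" characters, used for the type, subtype, and parameter names.
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
MEDIA_TYPE_RE = re.compile(rf"{TOKEN}/{TOKEN}")
PARAM_RE = re.compile(rf'({TOKEN})=({TOKEN}|"[^"]*")')


def is_strict_content_type(value: str) -> bool:
    """Return True if the header value passes the strict checks in this sketch."""
    # Simplification: splitting on ';' would mis-handle a quoted parameter
    # value that itself contains ';'.
    media_type, *params = [part.strip() for part in value.split(";")]
    if not MEDIA_TYPE_RE.fullmatch(media_type):
        return False
    for param in params:
        match = PARAM_RE.fullmatch(param)
        if match is None:
            # Also rejects the empty parameter left behind by a trailing ';'.
            return False
        name = match.group(1).lower()
        raw_value = match.group(2).strip('"').lower()
        if name == "charset" and raw_value not in ALLOWED_CHARSETS:
            return False
    return True
```

With these assumptions, all three headers reported above are rejected, while `application/json; charset=utf-8` is accepted.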