Describe the bug
I was investigating why a URL that is already included in the IoCs exported from my collection wasn't recognized in a few submissions. It turned out that the URL contains some uppercase characters, while the version saved by the Badlist updater was normalized to lowercase, but no normalization is performed when matching against the badlist. It looks like items added to the Badlist manually aren't normalized either.
To Reproduce
Steps to reproduce the behavior:
Have a file that contains a URL in both its normalized (lowercase) and original (mixed-case) forms, for example a TXT file.
Expected behavior
Tags are matched against the badlist regardless of whether they appear in normalized or original form.
Screenshots
Environment (please complete the following information if pertinent):
Additional context
I found the normalization in the update server:
https://github.com/CybercentreCanada/assemblyline-service-badlist/blob/3821ce750186704fd649609352ca822bf949877b/badlist/update_server.py#L158
However, I could not find it in any of the following places.
Badlist client: https://github.com/CybercentreCanada/assemblyline-core/blob/06eb4c46f77be82e657de489d3d8d9350e709ea1/assemblyline_core/badlist_client.py#L139-L152
Service API: https://github.com/CybercentreCanada/assemblyline-v4-service/blob/7dbbddc1cbedc8c3324c8a936b255552bd62fde6/assemblyline_v4_service/common/api.py#L140-L152
Badlist service itself: https://github.com/CybercentreCanada/assemblyline-service-badlist/blob/3821ce750186704fd649609352ca822bf949877b/badlist/badlist.py#L97-L122
I suspect this may also be the case for file hashes and the safelist, but I haven't tested those cases.
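To make the mismatch concrete, here is a minimal, self-contained sketch of what I think is happening. It does not use any Assemblyline code: `normalize_ioc`, `ToyBadlist`, and the example URLs are hypothetical names made up for illustration, and I am assuming the updater's normalization amounts to lowercasing the value.

```python
def normalize_ioc(value: str) -> str:
    """Hypothetical stand-in for the updater's normalization (assumed: lowercase)."""
    return value.strip().lower()


class ToyBadlist:
    """Toy in-memory badlist; not the real Badlist client or service."""

    def __init__(self) -> None:
        self._items: set[str] = set()

    def add(self, value: str, normalize: bool = True) -> None:
        # normalize=True mimics the updater; normalize=False mimics a manual add.
        self._items.add(normalize_ioc(value) if normalize else value)

    def contains_exact(self, value: str) -> bool:
        # Lookup as it appears to work today: the tag is compared as-is.
        return value in self._items

    def contains_normalized(self, value: str) -> bool:
        # Lookup with the same normalization applied to the tag.
        return normalize_ioc(value) in self._items


badlist = ToyBadlist()
tag = "http://Example.com/Payload.EXE"  # made-up URL as extracted from a file

badlist.add(tag)  # the updater stores the lowercase form
exact_hit = badlist.contains_exact(tag)            # misses the stored entry
normalized_hit = badlist.contains_normalized(tag)  # finds the stored entry

# A manually added, non-normalized entry is missed even by a normalized lookup.
manual = "http://Other.example/Dropper.BIN"
badlist.add(manual, normalize=False)
manual_hit = badlist.contains_normalized(manual)
```

In this sketch the lookup only succeeds when both the stored entry and the incoming tag go through the same normalization, which is why I think it needs to be applied on manual adds as well as at match time.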