tw4l closed 3 years ago
Hi @ross-spencer - I believe this is now ready for re-review! I have addressed your inline comments, reworked the README, added an `IMPLEMENTATION.md` (nice suggestion!), and added a `Makefile` with commands to help publish the package to PyPI. Among the changes, the `overview` command now outputs JSON, which I'm really enjoying.
A few specific comments I wanted to address:
> Is there a preferred default argument that should be used without requiring a flag? I like the idea of that for just getting going - `cat | auditmatica > output_happens_here`

I see what you mean, but the downside would be that running just `auditmatica` would no longer display the help menu with the list of available subcommands. So I'd rather leave it as-is if you're okay with it.
> Cool that you've used type hinting in places too. I'd have been really interested to see the outcome if it is used more globally, is there a rationale for not using it more widely?

We talked about this a bit offline, but the inconsistent usage is because Python type hinting expects you to specify the types of keys and values for `dict`s, which would require a bit of an overhaul at this stage that I don't think would bring additional value besides making the type annotations happy. So I used type annotations for the simpler types like `str` and `bool` but not for `dict`s.
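
To illustrate the trade-off, here is a minimal sketch of two ways a dict could be annotated. The class, function, and field names are hypothetical and are not taken from the package; they only show how much structure each option asks you to spell out.

```python
from typing import Any, Dict, TypedDict


class ParsedLine(TypedDict):
    """Hypothetical shape of a parsed log line; the field names are illustrative."""

    remote_addr: str
    status: int
    authenticated: bool


def summarize_loose(line: Dict[str, Any]) -> str:
    """Coarse annotation: string keys, values of any type."""
    return f"{line['remote_addr']} {line['status']}"


def summarize_strict(line: ParsedLine) -> str:
    """Precise annotation: a type checker can verify each key and its value type."""
    return f"{line['remote_addr']} {line['status']}"


loose: Dict[str, Any] = {"remote_addr": "203.0.113.7", "status": 200}
strict: ParsedLine = {"remote_addr": "203.0.113.7", "status": 200, "authenticated": True}

print(summarize_loose(loose))
print(summarize_strict(strict))
```

The `Dict[str, Any]` form is cheap to add but tells a type checker little; the `TypedDict` form is where the overhaul cost would come from, since every dict shape needs its own declaration.
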
One more note - I'd like to publish this to PyPI but am just waiting for code review to be complete before doing so :)
Connected to https://github.com/archivematica/Issues/issues/1341
This PR adds a Python `auditmatica` package that, as its primary function, generates Common Event Format (CEF) event logs from an Archivematica nginx access log. The main functions that do this work are `auditmatica.access_log.parse_access_log_line` (which parses an nginx access log line into a formatted dictionary), `auditmatica.access_log.add_event_info` (which compares the parsed line to Archivematica and Storage Service event mappings in `auditmatica.access_log.events` and annotates the parsed line with event information) and `auditmatica.cef.write_cef_event` (which writes the CEF event string).

These are in turn used by a CLI, written using Click, that provides a simple user interface: `auditmatica write-cef [options] [LOG]`. The resulting CEF logs can be written to stdout (default), a file, or syslog.

Storage Service log lines are distinguished from those of Archivematica through use of the `--ss-base-url` option.

I also put far less effort into a second CLI subcommand, `auditmatica overview [options] [LOG]`. This is intended as a demonstration of how the package could be used to gain insight into the use of an Archivematica instance.

Edit: Adding that the Archivematica and Storage Service event mappings are incomplete. I tried to focus on the events that seemed the most important from a security perspective (e.g. accessing stored data, adding or editing users, and editing configuration details). This Google doc provides some context into which events are (and are not yet) included: https://docs.google.com/document/d/1ufo9rlH7Gff9hvWcBYrxo0UDZ0wOmS9GlbD6U3Ll1AI/edit?usp=sharing. Right now that's only accessible to Artefactual staff. I'd be interested in your thoughts @ross-spencer on whether a distilled version of that document should live in this repo (maybe as a CSV?) so we can track progress moving forward.