domainaware / parsedmarc

A Python package and CLI for parsing aggregate and forensic DMARC reports
https://domainaware.github.io/parsedmarc/
Apache License 2.0
981 stars 212 forks source link

8.3.1 fails if output files don't exist #358

Closed jdghub closed 1 year ago

jdghub commented 1 year ago

This problem appeared in 8.3.1 and didn't occur in 8.3.0.

Running under Windows, and seen with both regular and portable Python 3.10 installs. ParseDMARC pd.ini file includes:

[general]
save_aggregate = True
save_forensic = True
debug = True
output = X:\Scripts\ParseDMARC\Logs\20221003
aggregate_json_filename = agg.json
forensic_json_filename = for.json
aggregate_csv_filename = agg.csv
forensic_csv_filename = for.csv

When run ParseDMARC 8.3.1 successfully processes the contents of the inbox, but then exits with

X:\Scripts\ParseDMARC>python3\scripts\parsedmarc.exe -c .\pd.ini
    INFO:cli.py:777:Starting parsedmarc
0it [00:00, ?it/s]
  [process Inbox reports...]
Traceback (most recent call last):
  File "runpy.py", line 196, in _run_module_as_main
  File "runpy.py", line 86, in _run_code
  File "X:\Scripts\ParseDMARC\python3\scripts\parsedmarc.exe\__main__.py", line 7, in <module>
  File "X:\Scripts\ParseDMARC\Python3\Lib\site-packages\parsedmarc\cli.py", line 975, in _main
    process_reports(results)
  File "X:\Scripts\ParseDMARC\Python3\Lib\site-packages\parsedmarc\cli.py", line 81, in process_reports
    save_output(results, output_directory=opts.output,
  File "X:\Scripts\ParseDMARC\Python3\Lib\site-packages\parsedmarc\__init__.py", line 1325, in save_output
    append_json(os.path.join(output_directory, aggregate_json_filename),
  File "X:\Scripts\ParseDMARC\Python3\Lib\site-packages\parsedmarc\__init__.py", line 1265, in append_json
    with open(filename, "r+", newline="\n", encoding="utf-8") as output:
FileNotFoundError: [Errno 2] No such file or directory: 'X:\\Scripts\\ParseDMARC\\Logs\\20221003\\agg.json'

The problem occurs accessing via both IMAP and MS Graph, and the workaround is to create the 4 empty output files before running ParseDMARC. Not a big problem, but should not be necessary, and was not necessary with 8.3.0.

jdghub commented 1 year ago

And not noticed until now: This also results in the CSV files not getting the initial header row (presumably written at the same time the CSV files are supposed to be created.)

bhandscomb commented 1 year ago

OK, looks like this is a result of pull 342 to fix issue 226 intention being for various reasons to append to existing files.

Problem is the "r+" mode doesn't create a file if it doesn't exist so it fails...

As for header row missing, if the file exists and is of non-zero length the headers do not get written because the code believes the headers to already be there... If you "touch" to create a file so it exists with zero length the "r+" option works fine.

If the option is changed to "a+" results are appended if the file exists and if the file does not exist it is created. This would appear to be what is intended. This "r+" to "a+" is probably needed in both append_json and append_csv

Pascal76 commented 1 year ago

still not fixed ? :(

javadovjavad commented 1 year ago

When will this issue be resolved?

seanthegeek commented 1 year ago

This issue is finally fixed in 8.4.1. Sorry this bug was neglected for so long.

jdghub commented 1 year ago

Thanks!