SIDN / CycleHunter

Python software that reads zone files, extracts NS records, and detects cyclic dependencies
https://tsuname.io
BSD 2-Clause "Simplified" License

FileNotFoundError: [Errno 2] No such file or directory: ....step4.json #3

Closed dirkvdplas closed 3 years ago

dirkvdplas commented 3 years ago

Hello,

It seems there is a bug somewhere in the script: I get a file-not-found error when running it. I tried the following:

See error message below:

`(venv) user@server:/tmp/CycleHunter$ ./CycleHunter.py --zonefile /tmp/cyclic/xxx.com --origin xxx.com --save-file /tmp/cyclic/xxx_com_output
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 21/21 [00:00<00:00, 62.02it/s]
und jetz?
Step 1: read timed out zones
Step 2: create Authority objects
Step 3: get only zones without in-bailiwick/in-zone authoritative servers
Step 4: sort which ones are cyclic
step 7: writing down results
step 1: read cyclic domains
Traceback (most recent call last):
  File "/tmp/CycleHunter/zoneMatcher.py", line 104, in getCyclic
    with open(infile, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'xxx.com.2021-02-08.step4.json'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./CycleHunter.py", line 51, in <module>
    zone_matcher(cyclic_domain_file=output4, zonefile=args.zonefile, zoneorigin=args.origin, output_file=args.save_file)
  File "/tmp/CycleHunter/zoneMatcher.py", line 123, in zone_matcher
    cyclic = getCyclic(cyclic_domain_file)
  File "/tmp/CycleHunter/zoneMatcher.py", line 108, in getCyclic
    with open(infile, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'xxx.com.2021-02-08.step4.json'`

Best regards, Dirk
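The pair of tracebacks is Python's implicit exception chaining: the `open()` at line 104 failed, and a second `open()` on the same path at line 108 then failed while the first error was still being handled. The sketch below reproduces that pattern under stated assumptions; `get_cyclic_naive` is a hypothetical stand-in and not the actual code in zoneMatcher.py.

```python
import os
import tempfile


def get_cyclic_naive(infile):
    """Hypothetical reader that retries the same missing path in its handler."""
    try:
        with open(infile, "r") as f:
            return f.read()
    except FileNotFoundError:
        # Opening the same path again fails again. Python stores the first
        # error as __context__ of the second and prints both tracebacks.
        with open(infile, "r") as f:
            return f.read()


# A path inside a fresh temporary directory, so it is guaranteed not to exist.
missing = os.path.join(tempfile.mkdtemp(), "example.step4.json")
try:
    get_cyclic_naive(missing)
except FileNotFoundError as err:
    chained = err.__context__
    print(type(err).__name__, "raised while handling", type(chained).__name__)
```

Retrying inside the `except` block cannot succeed if nothing creates the file in between, so a handler of this shape only moves the failure a few lines down.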

gmmoura commented 3 years ago

thanks for reporting, Dirk.

I added some lines to handle file exceptions. Please pull it again and retry. (I tested and it worked here)

ps: the short answer is that there were no cyclically dependent NS records, so an empty file was written

dirkvdplas commented 3 years ago

Hello Giovane,

You're welcome. I tried again after pulling your update, but unfortunately the same error appears.

I am working in a Python virtualenv, but that should not matter, right?

On Mon, Feb 8, 2021 at 10:01 AM Giovane Moura notifications@github.com wrote:

thanks for reporting, Dirk.

I've created a branch named dev and added a bunch of file exception handlers.

Could you please pull dev and try to run it again, and let me know if it works?

thanks


dirkvdplas commented 3 years ago

Sorry, I forgot to include the output. I have also added a second issue below.

python3 ./CycleHunter.py --zonefile XXX.com --origin .XXX.com --save-file dirk

[root@test CycleHunter]# python3 ./CycleHunter.py --zonefile XXX.com --origin .XXX.com --save-file dirk
100%|#############################################################################################################################| 21/21 [00:01<00:00, 19.61it/s]
und jetz?
Step 1: read timed out zones
Step 2: create Authority objects
Step 3: get only zones without in-bailiwick/in-zone authoritative servers
Step 4: sort which ones are cyclic
step 7: writing down results
step 1: read cyclic domains
Traceback (most recent call last):
  File "/root/CycleHunter/zoneMatcher.py", line 104, in getCyclic
    with open(infile, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'XXX.com.2021-02-08.step4.json'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./CycleHunter.py", line 66, in <module>
    zone_matcher(cyclic_domain_file=output4, zonefile=args.zonefile, zoneorigin=args.origin, output_file=args.save_file)
  File "/root/CycleHunter/zoneMatcher.py", line 123, in zone_matcher
    cyclic = getCyclic(cyclic_domain_file)
  File "/root/CycleHunter/zoneMatcher.py", line 108, in getCyclic
    with open(infile, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'XXX.com.2021-02-08.step4.json'

2nd issue:

A simple zone file like this (note that I anonymized the output):

YYYtravel.com. 86400 IN SOA ns1.MYCOMPANY.net. zonemaster.MYCOMPANY.com. 2020012002 28800 7200 604800 86400
YYYtravel.com. 3600 IN A 6.6.6.6
YYYtravel.com. 86400 IN NS ns1.MYCOMPANY.net.
YYYtravel.com. 86400 IN NS ns2.MYCOMPANY.net.
localhost.YYYtravel.com. 86400 IN A 127.0.0.1
www.YYYtravel.com. 3600 IN CNAME redirect.MYCOMPANYis.nl.
YYYtravel.com. 86400 IN SOA ns11.MYCOMPANY.net. zonemaster.MYCOMPANY.com. 2020012002 28800 7200 604800 86400

python3 ./CycleHunter.py --zonefile YYYtravel.com --origin .YYYtravel.com --save-file dirk
YYYtravel.com.2021-02-08.step1.txt has no NS records; stop here. Plase check if largeZoneParser correctly parsers your zone file

This zone file is parsed correctly by dnspython when I try it myself, so I have no clue why your script does not detect the NS records:

>>> import dns.zone
>>> z = dns.zone.from_file('YYYtravel.com', 'YYYtravel.com', relativize=False)
>>> names = z.nodes.keys()
>>> for n in names:
...     for rr_set in z[n].rdatasets:
...         print(rr_set)
...
86400 IN SOA ns11.MYCOMPANY.net. zonemaster.MYCOMPANY.com. 2020012002 28800 7200 604800 86400
3600 IN A 6.6.6.6
86400 IN NS ns1.MYCOMPANY.net.
86400 IN NS ns2.MYCOMPANY.net.
86400 IN A 127.0.0.1
3600 IN CNAME redirect.MYCOMPANYis.nl.


gmmoura commented 3 years ago

YYYtravel.com.2021-02-08.step1.txt has no NS records; stop here. Please check if largeZoneParser correctly parsers your zone file

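We don't use any library to read the zone files, because libraries can be very slow with large zone files such as .com.

Instead, we read the file line by line, and there is large variation in the way zone files are actually written, so I don't think there is a one-size-fits-all solution here.

I recommend the following (for this zone):

  • Debug largeZoneParser, make sure it is actually reading the records, and adjust it to your zone.
  • After that, run the other steps manually (or the whole thing again automatically with CycleHunter.py).

ps: there is one pull request that customizes the parser for its author's own zone, so other folks have had this issue too.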
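The line-by-line approach described above can be sketched as a generator that streams a zone file and keeps only the NS records. This is an illustration under stated assumptions, not the project's largeZoneParser: `iter_ns_records` is a hypothetical name, and the sketch assumes every record sits on a single line in the order owner, TTL, class, type, rdata.

```python
import io


def iter_ns_records(lines):
    """Yield (owner, nameserver) pairs for NS records, one line at a time."""
    for line in lines:
        line = line.split(";", 1)[0]  # drop a trailing zone-file comment
        fields = line.split()
        # Assumed layout: owner TTL class type rdata (hypothetical, see lead-in).
        if len(fields) >= 5 and fields[2].upper() == "IN" and fields[3].upper() == "NS":
            # DNS names are case-insensitive, so normalize to lower case.
            yield fields[0].lower(), fields[4].lower()


# An in-memory stand-in for an open zone file, using records from this thread.
sample = io.StringIO(
    "YYYtravel.com. 86400 IN SOA ns1.MYCOMPANY.net. zonemaster.MYCOMPANY.com. "
    "2020012002 28800 7200 604800 86400\n"
    "YYYtravel.com. 3600 IN A 6.6.6.6\n"
    "YYYtravel.com. 86400 IN NS ns1.MYCOMPANY.net.\n"
    "YYYtravel.com. 86400 IN NS ns2.MYCOMPANY.net.\n"
)
ns_records = list(iter_ns_records(sample))
print(ns_records)
```

Because the generator never holds more than one line, memory use stays flat even for very large zones. The cost is that zone files with omitted owners, omitted TTLs, or multi-line parenthesized records need their own handling, which is the variation mentioned above.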

dirkvdplas commented 3 years ago

Hey Giovane,

Just a small change to better handle zone files, as some of them are separated by \t and others are not (at least in our environment). split() splits on whitespace in general. I just sent you a PR for this small improvement.
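A short illustration of the difference, assuming the parser previously split each line on a literal tab (the sample lines are taken from the zone file earlier in this thread):

```python
tab_line = "YYYtravel.com.\t86400\tIN\tNS\tns1.MYCOMPANY.net.\n"
space_line = "YYYtravel.com. 86400 IN NS ns1.MYCOMPANY.net.\n"

# Splitting on a literal tab only separates fields in tab-delimited files.
tab_only = [line.split("\t") for line in (tab_line, space_line)]

# split() with no argument splits on any run of whitespace and also
# discards the trailing newline.
any_ws = [line.split() for line in (tab_line, space_line)]

print(len(tab_only[1]))        # number of fields found in the space-separated line
print(any_ws[0] == any_ws[1])  # whether both layouts yield the same fields
```

With `split("\t")` the space-separated line comes back as a single field, so a parser looking for `NS` in the fourth column would find nothing in that file.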

After fixing that one, I ended up with the error I initially reported:

FileNotFoundError: [Errno 2] No such file or directory: 'XXXX.2021-02-08.step4.json'

Any idea about that?

Dirk


gmmoura commented 3 years ago

thanks for the PR. Like the other PR, it may be specific to your setup (so I am not sure what to do with it ATM).

I added another exception handler.

Just to be sure, could you please:

  1. re-run the whole thing manually, following the step-by-step tutorial
  2. check whether the *step4.json actually has some content. If it does not, it is because CycleHunter did not find any cyclic dependencies, and you're OK then

dirkvdplas commented 3 years ago

I reran all steps manually:

[root@test CycleHunter]# cat YYYcom.2021-02-08.step3.json
{"partialDep": {}, "fullDep": {}, "fullDepWithInzone":

YYYcom.2021-02-08.step4.json -> not generated

When running all steps in CycleHunter.py the output is as follows:

[root@test CycleHunter]# python3 CycleHunter.py --zonefile YYY.com --origin .YYY.com --save-file output
100%|#############################################################################################################################| 21/21 [00:01<00:00, 17.54it/s]
und jetz?
Step 1: read timed out zones
Step 2: create Authority objects
Step 3: get only zones without in-bailiwick/in-zone authoritative servers
Step 4: sort which ones are cyclic
step 7: writing down results
step 8: read cyclic domains
YYY.com.2021-02-08.step4.json does not exist; exiting

P.S. There is another small bug in findCyclicDep.py, line 960 (cycle-output should be cycle_output).

gmmoura commented 3 years ago

thanks! I think you're good, you don't seem to have cyclic dependencies 👍

Note that YYYcom.2021-02-08.step4.json is only written IF there are cyclic dependencies.

Does that make sense to you? (I'll improve the error log message, and fix the bug.)

You can try to validate that manually by sampling some of the step3.json domains and checking whether that's the case.
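Since the step4 file only exists when cyclic dependencies were found, a reader can treat a missing file as "nothing found" rather than as an error. A minimal sketch of that idea follows; `load_cyclic_domains` is a hypothetical helper, not the function in zoneMatcher.py, and it makes no assumption about the layout of the JSON.

```python
import json
import os
import tempfile


def load_cyclic_domains(path):
    """Return the parsed step-4 JSON, or None when the file was never written."""
    try:
        with open(path, "r") as f:
            return json.load(f)
    except FileNotFoundError:
        # Per the explanation above: no step4 file means that no cyclic
        # dependencies were found, so this is a normal outcome.
        return None


# Demonstrate the missing-file case with a path that cannot exist yet.
workdir = tempfile.mkdtemp()
result = load_cyclic_domains(os.path.join(workdir, "example.step4.json"))
if result is None:
    print("no step4 file: no cyclic dependencies were found")
```

Returning `None` keeps "file absent" distinguishable from "file present but empty", which lets the caller print a clear message and exit cleanly.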

gmmoura commented 3 years ago

Shall I close this issue?

dirkvdplas commented 3 years ago

I tested branch pr5. It's better now, but I still have issues parsing zone files with your version (with my version I am able to parse both). Also, for a domain with a cyclic dependency, the zonematcher is still failing; see the output below. This does not occur with my version, so I would appreciate it if you could compare your version of zonematcher against mine:

(venv) user@ubuntu-s-1vcpu-1gb-ams3-01:/tmp/CycleHunter$ python CycleHunter.py --zonefile DOMAINXXX.com --origin DOMAINXXX.com --save-file XXXcom
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 53.06it/s]
und jetz?
Step 1: read timed out zones
Step 2: create Authority objects
Step 3: get only zones without in-bailiwick/in-zone authoritative servers
Step 4: sort which ones are cyclic
analyzying test.XXXtest.nl.. Domain 1 from 1
All nameservers failed to answer the query test12.DOMAINXXX.com. IN SOA: Server 1.1.1.1 UDP port 53 answered SERVFAIL
step 7: writing down results
step 8: read cyclic domains
step 8a: read zone file and find them
step 8b: writing it to json
ERROR: could not match domain names to NS records; please check zoneMatcher.py


SvenVD-be commented 3 years ago

thanks! I think you're good, you don't seem to have cyclic dependencies 👍

Note that YYYcom.2021-02-08.step4.json is only written IF there are cyclic dependencies.

Does that make sense to you? (I'll improve the error log message, and fix the bug.)

You can try to validate that manually by sampling some of the step3.json domains and checking whether that's the case.

For the record, I also get the message "XXZONE..2021-02-10.step4.json does not exist; exiting".

In my case, XXZONE.2021-02-10.step3.json has empty values: {"partialDep": {}, "fullDep": {}, "fullDepWithInzone": {}}

However, step 2 does contain data, so I guess those zones just timed out for reasons other than cyclic dependencies.
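A quick way to confirm that a step3 file holds no candidates is to load it and check each category. The sketch below uses the exact JSON quoted above; `has_candidates` is a hypothetical helper, and the shape of a non-empty entry is an assumption.

```python
import json

# The step-3 content quoted in this comment.
step3_text = '{"partialDep": {}, "fullDep": {}, "fullDepWithInzone": {}}'


def has_candidates(step3):
    """Return True if any category in the step-3 result holds an entry."""
    return any(len(entries) > 0 for entries in step3.values())


step3 = json.loads(step3_text)
print(has_candidates(step3))
```

When every category is empty, step 4 has nothing to analyze, which is consistent with no step4 file being written.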

gmmoura commented 3 years ago

yeah, exactly.

gmmoura commented 3 years ago

(I am closing this unless you see an issue)