tomspencer opened this issue 1 year ago
Hello!
Can you try to change this:
try:
    logging.debug('Search Filter=%s' % searchFilter)
    sc = ldap.SimplePagedResultsControl(size=100)
    ldapConnection.search(searchFilter=searchFilter,
                          attributes=['sAMAccountName', 'pwdLastSet', 'mail', 'lastLogon'],
                          sizeLimit=0, searchControls=[sc], perRecordCallback=self.processRecord)
to
try:
    logging.debug('Search Filter=%s' % searchFilter)
    # Microsoft Active Directory sets a hard limit of 1000 entries returned by any search
    paged_search_control = ldapasn1.SimplePagedResultsControl(criticality=True, size=1000)
    resp = ldapConnection.search(searchFilter=searchFilter,
                                 attributes=['sAMAccountName', 'pwdLastSet', 'mail', 'lastLogon'],
                                 searchControls=[paged_search_control], perRecordCallback=self.processRecord)
within the GetADUsers.py file (line 192)? This is how paged search is implemented in other impacket examples, and it works.
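(For context: RFC 2696 paging is a cookie round-trip. Each search response carries an opaque cookie that the client must copy into the paged-results control of its next request, and an empty cookie marks the final page. A rough sketch of that contract, where make_paged_control and send_paged_search are hypothetical placeholders rather than impacket API:

# Schematic RFC 2696 paged-search loop. make_paged_control() and
# send_paged_search() are hypothetical placeholders, not impacket API.
cookie = b''
while True:
    control = make_paged_control(size=1000, cookie=cookie)      # hypothetical helper
    entries, cookie = send_paged_search(searchFilter, control)  # hypothetical helper
    for entry in entries:
        processRecord(entry)
    if not cookie:  # the server sends an empty cookie with the final page
        break

If the cookie is not threaded through correctly, e.g. a stale cookie is re-sent, the server replays pages it has already returned, which would look exactly like the repeated batches described in this issue.)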
I was optimistic that was going to work, but unfortunately it exacerbated the problem. 😞
Same general behavior as above, but now the duplicates are in groups of 1,000. This resulted in an even larger file. I killed it before it finished, but it was over 400MB and over 4 million lines, and had only gotten ~60k of the ~90k unique users.
$ grep --line-num "^jdoe " console2.out
7022:jdoe j.doe@example.com 2023-06-05 09:09:57.417645 N/A
8022:jdoe j.doe@example.com 2023-06-05 09:09:57.417645 N/A
$ grep --line-num "^kburns " console2.out | head -3
4010463:kburns ken.burns@example.com 2022-12-02 05:47:35.254596 N/A
4011463:kburns ken.burns@example.com 2022-12-02 05:47:35.254596 N/A
4012463:kburns ken.burns@example.com 2022-12-02 05:47:35.254596 N/A
$ grep --line-num "^kburns " console2.out | wc -l
87
Mhh, impacket's LDAP implementation is not really stable and not strictly LDAP-RFC-compliant, which is why a lot of modules use the ldap3 library instead. I recommend switching to another tool such as pywerview (with the get-netuser function), for example.
Full disclosure: I'm the maintainer of pywerview.
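For reference, here is a minimal sketch of an equivalent query with ldap3, whose paged_search helper drives the RFC 2696 cookie internally; the host, credentials, base DN, and page size below are placeholders, not values verified against this thread:

from ldap3 import Server, Connection, NTLM, SUBTREE

# Placeholder host and credentials -- adjust for your environment.
server = Server('dc02.domain.local')
conn = Connection(server, user='DOMAIN\\administrator', password='Password963',
                  authentication=NTLM, auto_bind=True)

# paged_search handles the RFC 2696 cookie round-trip internally and
# yields entries page by page when generator=True.
entries = conn.extend.standard.paged_search(
    search_base='DC=domain,DC=local',  # placeholder base DN
    search_filter='(&(objectCategory=person)(objectClass=user))',
    search_scope=SUBTREE,
    attributes=['sAMAccountName', 'pwdLastSet', 'mail', 'lastLogon'],
    paged_size=500,
    generator=True)

for entry in entries:
    if entry['type'] == 'searchResEntry':
        print(entry['attributes'].get('sAMAccountName'))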
I'm trying to reproduce the issue but I can't:
$ GetADUsers.py domain.local/administrator:'Password963' -all -dc-ip dc02.domain.local > out
$ wc -l out
20021 out
$ cut -f 1 -d ' ' out | sort | uniq -c | sort -rn | head
1 999959454
1 999919655
1 999672882
1 999507452
1 999491188
1 999302173
1 999273730
1 999166230
1 999120571
1 999033491
$ cut -f 1 -d ' ' out | sort | uniq -c | sort -n | head
1 1000027054
1 1000172078
1 1000526385
1 100057581
1 1000606675
1 1000636693
1 1000679411
1 1000765249
1 1000786424
1 1000812028
(for the test, I created ~20k users named after PowerShell's Get-Random)
Strange. The environments I'm seeing this in are on the order of 90k and 200k users, but I'm not sure if that matters.
If there is anything I can do to provide more reproduction/debugging info, please let me know.
When running GetADUsers.py in two separate large domains (80k+ users, although I suspect 10-20k might be enough to trigger this), there are a huge number of duplicate user entries in batches of 100 (e.g. user entry 8000 is repeated at 8100, 8001 is repeated at 8101, etc.). The situation worsens the more users you have.
Configuration
impacket version: current master branch (v0.10.1.dev1+20230628.102844.eb8a3944) and v0.10.0-4
Python version: v3.10.6 and v3.11.2
Target OS: Ubuntu 22.04 and Kali 2023.2
Debug Output With Command String
./GetADUsers.py -k -no-pass '[redacted]/[redacted]@[redacted]' -all -dc-ip [redacted] -debug
The command runs fine but takes a very long time and (when directed to a file) produces a massive file full of duplicate batches of users. In an environment with 90k users, the resulting file was ~381MB and ~3.87 million lines long. When duplicate lines are removed (i.e. 'sort console.out | uniq > users.clean') the resulting file was ~9MB and ~90k lines long (as you would expect for 90k users).
Additional context
I suspect the problem lies with the LDAP paging/cursor, as that paging works in batches of 100, which aligns with these duplicate batches of 100.
In one environment the first duplicated batch of 100 users was at line ~6300 (meaning line 6300 was the same as line 6200, 6301 the same as 6201, etc.), while in another it was around 8100. It doesn't seem to be entirely consistent, but it appears to start within the first 10-20k entries and then continues to get worse as it goes.
Once the duplications begin (that is, batches start appearing 2x), later batches then start appearing 3x, then 4x, etc. One batch toward the end repeated 259 times.
The compounding duplication was shown in a screenshot (omitted here).
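As a rough way to reproduce those repeat counts, here is a minimal sketch (not from the original report); it assumes the raw output was redirected to console.out and that the duplicates align to 100-line batches:

# Count how often each batch of 100 consecutive lines repeats in the output.
# Assumes the duplicated GetADUsers.py output was redirected to console.out.
from collections import Counter

with open('console.out') as fh:
    lines = fh.read().splitlines()

BATCH = 100  # matches the observed duplicate group size
batches = [tuple(lines[i:i + BATCH]) for i in range(0, len(lines), BATCH)]

for batch, count in Counter(batches).most_common(10):
    print('repeated %3d times, first line: %s' % (count, batch[0]))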
This problem compounds to the point that for even larger domains it becomes almost impossible to complete/store the output.
Of particular concern here is that if this issue is in the underlying ldap.py library, it may be impacting a number of other example scripts/components that rely on it as well.