Closed obilodeau closed 9 months ago
Our choices are:
- random instead of something better
- <first_name><100000-999999> is not big enough

Created a test script:
import random

import names
import namesgenerator

# OS-backed RNG, to check whether the randomness source matters
newrng = random.SystemRandom()
iterations = 1_000_000

old_way = []
new_namelib = []
new_random = []
new_len = []
new_combined = []
new_way = []

for _c in range(iterations):
    # Current scheme: first name + 6-digit number from random
    session_id = f"{names.get_first_name()}{random.randrange(100000, 999999)}"
    old_way.append(session_id)
    # Different name library, same 6-digit number
    session_id = f"{namesgenerator.get_random_name()}{random.randrange(100000, 999999)}"
    new_namelib.append(session_id)
    # Same name library, number drawn from SystemRandom
    session_id = f"{names.get_first_name()}{newrng.randrange(100000, 999999)}"
    new_random.append(session_id)
    # Same name library, 7-digit number
    session_id = f"{names.get_first_name()}{random.randrange(1000000, 9999999)}"
    new_len.append(session_id)
    # New name library + SystemRandom + 7-digit number
    session_id = f"{namesgenerator.get_random_name()}{newrng.randrange(1000000, 9999999)}"
    new_combined.append(session_id)
    # New name library + 7-digit number from random
    session_id = f"{namesgenerator.get_random_name()}{random.randrange(1000000, 9999999)}"
    new_way.append(session_id)
    # Progress indicator: one dot every 10,000 iterations
    if _c % 10_000 == 0:
        print(".", end="", flush=True)

print(f"\nGenerated names: {len(old_way)}")
print("\nResults of non duplicates remaining:")
print(f"Old way : {len(set(old_way))}")
print(f"New namelib : {len(set(new_namelib))}")
print(f"New random : {len(set(new_random))}")
print(f"New digit length (+1) : {len(set(new_len))}")
print(f"All Combined : {len(set(new_combined))}")
print(f"Combined namelib / +1 : {len(set(new_way))}")
Ran tests in the cloud and locally.
Results:
$ python random_names_check.py
....................................................................................................
Generated names: 1000000
Results of non duplicates remaining:
Old way : 998005
New namelib : 999960
New random : 998005
New digit length (+1) : 999799
All Combined : 999998
Combined namelib / +1 : 999997
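To read these numbers as duplicate counts, subtract each unique count from the 1,000,000 generated IDs. The sketch below does that, and also inverts the birthday approximation d ≈ n²/(2N) to estimate how large a uniformly sampled ID space N would have to be to produce the same number of duplicates. That "effective space" figure is a rough interpretation added here, not something the test measured, and the helper names are made up for illustration.

```python
# Unique counts copied from the test output above.
results = {
    "Old way": 998005,
    "New namelib": 999960,
    "New random": 998005,
    "New digit length (+1)": 999799,
    "All Combined": 999998,
    "Combined namelib / +1": 999997,
}
GENERATED = 1_000_000


def duplicates(unique, generated=GENERATED):
    """Number of generated IDs that collided with an earlier one."""
    return generated - unique


def effective_space(unique, generated=GENERATED):
    """Size of a uniform ID space giving the same duplicate count.

    Uses the birthday approximation d ~= n^2 / (2N), solved for N.
    This is only an estimate and assumes uniform sampling.
    """
    d = duplicates(unique, generated)
    if d == 0:
        return float("inf")
    return generated ** 2 / (2 * d)


for label, unique in results.items():
    print(
        f"{label:<22}: {duplicates(unique):>5} duplicates, "
        f"effective space ~{effective_space(unique):.2e}"
    )
```

Read this way, switching to SystemRandom alone leaves the duplicate count unchanged, while changing the name library or adding a digit reduces it.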
This impacts our log hunting capabilities.
Assuming random number generators behave normally, we should not see many duplicate names, given that we use a random name (1 out of 5494) and a random ID from 100000 to 999999.
However, we see hundreds of thousands of duplicates in our logs. Yes, we have millions of sessions, but that is still too many.
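As a sanity check on that expectation, here is a minimal sketch of the birthday-problem estimate. It assumes every name and every number is equally likely, which may not hold in practice: if the name library picks names non-uniformly (for example, weighted by popularity), duplicates will be more frequent than this predicts. The function name is hypothetical. Note that randrange(100000, 999999) excludes the upper bound, so there are 899,999 possible numbers.

```python
import math


def expected_duplicates(n_ids, space):
    """Expected number of IDs that repeat an earlier one.

    Assumes n_ids are drawn uniformly at random from `space` values.
    Expected unique count is space * (1 - (1 - 1/space) ** n_ids),
    computed with log1p/expm1 to stay accurate for large spaces.
    """
    expected_unique = -space * math.expm1(n_ids * math.log1p(-1.0 / space))
    return n_ids - expected_unique


NAME_COUNT = 5494                    # figure quoted in the issue
NUMBER_COUNT = 999_999 - 100_000     # randrange excludes the upper bound
space = NAME_COUNT * NUMBER_COUNT

for n in (1_000_000, 5_000_000):
    print(f"{n:>9} sessions: ~{expected_duplicates(n, space):.0f} expected duplicates")
```

Under these assumptions, a million sessions should produce on the order of a hundred duplicates, far fewer than what the logs show.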
SessionIDs are generated in pyrdp.core.mitm in the form <first_name><100000-999999>.
The names module seems to have dubious crypto, as challenged here: https://github.com/treyhunner/names/issues/18#issuecomment-272858252

With 2.0 on the horizon, it's time to re-evaluate how we generate session IDs.
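One possible direction, sketched with only the standard library: keep the readable name-plus-number shape, but make the number width configurable and draw both parts from the secrets module. This is not PyRDP's implementation. HYPOTHETICAL_NAMES is a placeholder for whatever name source is chosen, and make_session_id is a made-up helper. In the test results posted in this thread, the reduction in duplicates came from a larger ID space (different name library, extra digit) rather than from the randomness source, so the digits parameter is the part that matters most here.

```python
import secrets

# Placeholder list for illustration only; a real deployment would use
# a much larger name source.
HYPOTHETICAL_NAMES = ("Alice", "Bob", "Carol", "Dave", "Erin", "Frank")


def make_session_id(names=HYPOTHETICAL_NAMES, digits=7):
    """Return '<name><number>' where number has exactly `digits` digits."""
    if digits < 1:
        raise ValueError("digits must be >= 1")
    name = secrets.choice(names)
    low = 10 ** (digits - 1)
    # randbelow(9 * low) yields 0 .. 9*low - 1, so number spans
    # low .. 10**digits - 1 inclusive.
    number = low + secrets.randbelow(9 * low)
    return f"{name}{number}"


print(make_session_id())
print(make_session_id(digits=9))
```

Each extra digit multiplies the ID space by ten, which under the uniform assumption cuts expected duplicates by roughly the same factor.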