Closed PascalinDe closed 1 year ago
Hi Carine. This includes an error with the dirport (dir_port = int(dir_port or or_port)
) that I attempted to fix via a patch on #135.
Hi Damian,
I just rebased as-is so that https://github.com/torproject/stem/pull/135 can be treated independently (I'm focusing on that one first). I will look into this PR around mid next week, thank your for your help!
hi, sorry for the long delay, I didn't find time to work on it sooner
so, I integrated your changes and added one regarding dir_port
(https://github.com/torproject/stem/pull/138/commits/25a2cbe083b349c3e6a6d7d1f4a9315fab75c80c#r1285996434)
I have two unit tests that have different results than on the current master
branch:
the problem seems to be that they are validating the code on a mock of the server response that does no longer correspond to the current format:
I adapted the code to the current response, hence the tests failing
so I'm wondering whether I should update the test data or whether the code should continue to support the old format?
so I'm wondering whether I should update the test data or whether the code should continue to support the old format?
Ideally there should be tests with both the new and old fallback_dirs.inc. Tor specified the format of the file here.
If I was in the driver's seat on this the next questions I'd be asking are...
All that said, if you'd care to simply swap the tests to the new format and call it a day I don't mind. As previously mentioned Stem is unmaintained so it's silly to let the perfect be the enemy of the good on this. :)
so, I followed your pointers, and what I found :
The above script does not seem to follow too closely the specs (e.g. the two comment lines at the beginning) but is rather meant to accomodate arti 's API - there seems to be some awareness of this (cf. https://gitlab.torproject.org/tpo/core/tor/-/commit/131e2d99a45bd86b8379664cccb59bfc66f80c96) but I cannot find any other commits that address this.
So my approach would be to adapt my PR to follow the output of the above script, and update the test data accordingly, and add a comment in the file that summarizes the above - what do you think?
slightly off-topic, I really want to stress that we appreciate you taking the time reviewing the PRs despite not maintaining the project!
So my approach would be to adapt my PR to follow the output of the above script, and update the test data accordingly, and add a comment in the file that summarizes the above - what do you think?
Sounds good to me.
slightly off-topic, I really want to stress that we appreciate you taking the time reviewing the PRs despite not maintaining the project!
My pleasure! You are uncommonly delightful to work with. Thank you for making the open source world better. :)
I updated the PR as discussed (updated the test example to the current format and the code to the format as produced by the script of Tor's CI), and also updated the URL to the GitLab URL
I get the same output for the unit and integration tests on this branch and the master branch, and the manual tests I did look good as well
My pleasure! You are uncommonly delightful to work with. Thank you for making the open source world better. :)
thank you, that's very kind of you to say - it's a really nice project! while that wouldn't be in the scope of my work project, I have some free time on my hands and I was thinking of working a bit more on stem
, but I wouldn't want to impose on you (it'd be probably not more than one or so PR per month), would that generally be OK for you?
+ if attr['dir_port'] != 'None':
+ dir_port = int(attr['dir_port'])
+ else:
+ dir_port = None
There's a couple issues with this...
if attr['dir_port'] is not None:
).My commit did the following instead...
self.dir_port = int(dir_port) if dir_port else None
Is there a reason not to do this?
# example of current header
Please change 'current' to the present date. Sadly that comment won't be accurate in a few years. ;)
- for line in Fallback._pop_section(lines):
+ for _ in range(4):
+ line = lines.pop(0)
Why do we want the first four lines rather than pop_section? Won't this raise an unexpected IndexError if 'lines' doesn't have four values?
+ if attr['dir_port'] != 'None': + dir_port = int(attr['dir_port']) + else: + dir_port = None
There's a couple issues with this...
1. You're comparing with the string 'None' rather than python's None value. 2. This tidbit is pedantic but python's conventions prefer the use of 'is' for None checks (ie. `if attr['dir_port'] is not None:`).
My commit did the following instead...
self.dir_port = int(dir_port) if dir_port else None
Is there a reason not to do this?
The line you're referring to is in the from_cache method that loads the configuration from the cached fallbacks. If a cached fallback has no dir_port, this value is cached as 'None' (the string). You can see this e.g. in the test_persistence
test:
test/unit/directory/fallback.py
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> PDB set_trace >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> /home/cdengler/src/stem.c4dt/stem/directory.py(364)from_cache()
-> conf = stem.util.conf.Config()
(Pdb) n
> /home/cdengler/src/stem.c4dt/stem/directory.py(365)from_cache()
-> conf.load(path)
(Pdb) n
> /home/cdengler/src/stem.c4dt/stem/directory.py(366)from_cache()
-> headers = collections.OrderedDict([(k.split('.', 1)[1], conf.get(k)) for k in conf.keys() if k.startswith('header.')])
(Pdb) conf['6D6EC2A2E2ED8BFF2D4834F8D669D82FC2A9FA8D.dir_port']
['None']
I think I misunderstood you comment
port on which directory information is available, or None if it doesn't have one
as "this should also be the content in the fallback file" so I didn't change anything there (ln. 530) - shall I change the Fallback._write method to adapt to this? e.g. not writing this key at all to the file? then the line you proposed would work
# example of current header
Please change 'current' to the present date. Sadly that comment won't be accurate in a few years. ;)
You're absolutely right, I'll change that!
- for line in Fallback._pop_section(lines): + for _ in range(4): + line = lines.pop(0)
Why do we want the first four lines rather than pop_section? Won't this raise an unexpected IndexError if 'lines' doesn't have four values?
_pop_section
would include the first entry in the header, since the header is not separated from the entries by FALLBACK_DIV
anymore. As for the IndexError, as per the comment https://gitlab.torproject.org/tpo/core/fallback-scripts/-/blob/main/src/main.rs#L14 the header is currently hard-coded, so the number of lines will probably not change. If I understand the comment correctly, it is bound to be removed? But you're right my solution is neither here nor there, it should either be hard-coded as well or more flexible to a different length (or no header at all), I will change it to the latter.
I also forgot: I noticed when I was working on this that there are no more nodes where the nickname value is empty, instead there seems to be now the convention that they have the nickname "Unnamed" - I guessed that since they changed this, "Unnamed" would be an expected value for nickname and the key would be expected to be present, so I didn't change the code to check for this value and not include nickname as a key
pushed the changes as I proposed above, no regressions on both the unit and the integration tests
I put them in their own commit so they are easier to review
Thanks Carine, this looks good to me! It's now pushed.
The line you're referring to is in the from_cache method that loads the configuration from the cached fallbacks... shall I change the Fallback._write method to adapt to this? e.g. not writing this key at all to the file? then the line you proposed would work
Oh! I see. My vote is to leave Fallback._write as-is. You're correct that it (and cache_fallback_directories.py) will need to be changed if/when new fallbacks are cached.
I don't mind leaving that as tech debt for when the script is next used. The cache was last synced in 2020 so there might be other issue too.
there are no more nodes where the nickname value is empty, instead there seems to be now the convention that they have the nickname "Unnamed"
Sounds good.
Oh! I see. My vote is to leave Fallback._write as-is. You're correct that it (and cache_fallback_directories.py) will need to be changed if/when new fallbacks are cached.
Ah, I didn't expect you to merge the branch that quickly - I changed it in the commit https://github.com/torproject/stem/pull/138/commits/6700dee526d94c291ed864b384b8e20c4270a244 to illustrate what I meant, so it's now merged, so sorry for the mix-up that really wasn't clear from my side...
formerly https://github.com/torproject/stem/pull/137
I rebased onto master so that it is independent from https://github.com/torproject/stem/pull/135