torproject / stem

Python controller library for Tor
https://stem.torproject.org/
GNU Lesser General Public License v3.0
257 stars 75 forks source link

stem.directory.Fallback.from_remote() is not working anymore #136

Closed PascalinDe closed 10 months ago

PascalinDe commented 11 months ago
Python 3.11.4 (main, Jun  7 2023, 10:13:09) [GCC 12.2.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.5.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import stem.directory

In [3]: print(stem.directory.Fallback.from_remote())
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
Cell In [3], line 1
----> 1 print(stem.directory.Fallback.from_remote())

File ~/src/stem/stem/directory.py:421, in Fallback.from_remote(timeout)
    419     header[mapping.group(1)] = mapping.group(2)
    420   else:
--> 421     raise OSError('Malformed fallback directory header line: %s' % line)
    423 Fallback._pop_section(lines)  # skip human readable comments
    425 # Entries look like...
    426 #
    427 # "5.9.110.236:9030 orport=9001 id=0756B7CD4DFC8182BE23143FAC0642F515182CEB"
    428 # " ipv6=[2a01:4f8:162:51e2::2]:9001"
    429 # /* nickname=rueckgrat */
    430 # /* extrainfo=1 */

OSError: Malformed fallback directory header line: //
PascalinDe commented 11 months ago

the problem seems to be the assumption on the ordering of the lines in the file

the beginning of the file looks like this, with ln. 5-7 being problematic (two comments and one empty line that will not match FALLBACK_MAPPING)

/* type=fallback */
/* version=4.0.0 */
/* timestamp=20210412000000 */
/* source=offer-list */
//
// Generated on: Wed, 12 Jul 2023 12:29:56 +0000

"95.216.146.117 orport=9001 id=72270EB58EDEBE727AA29E67417628DBCE889FAE"
" ipv6=[2a01:4f9:c010:a48::1]:9001"
/* nickname=giesskanne */
/* extrainfo=0 */
/* ===== */

Fallback._pop_section() does not put the lines 5-7 in their own section, or they'd be taken care off by line 423

Fallback._pop_section(lines)  # skip human readable comments

that just skips an entire section on the presumption that it contains the comments

so the fix for this problem would be something like this:

diff --git a/stem/directory.py b/stem/directory.py
index c4bde1c4..4d7c64a6 100644
--- a/stem/directory.py
+++ b/stem/directory.py
@@ -60,7 +60,7 @@ AUTHORITY_V3IDENT = re.compile('"v3ident=([\\dA-F]{40}) "')
 AUTHORITY_IPV6 = re.compile('"ipv6=\\[([\\da-f:]+)\\]:(\\d+) "')
 AUTHORITY_ADDR = re.compile('"([\\d\\.]+):(\\d+) ([\\dA-F ]{40,49})",')

-FALLBACK_DIV = '/* ===== */'
+FALLBACK_DIV = ('/* ===== */', '')  # 1st section is separated by empty line subsequent sections by '/* ===== */'
 FALLBACK_MAPPING = re.compile('/\\*\\s+(\\S+)=(\\S*)\\s+\\*/')

 FALLBACK_ADDR = re.compile('"([\\d\\.]+):(\\d+) orport=(\\d+) id=([\\dA-F]{40}).*')
@@ -420,8 +420,6 @@ class Fallback(Directory):
       else:
         raise OSError('Malformed fallback directory header line: %s' % line)

-    Fallback._pop_section(lines)  # skip human readable comments
-
     # Entries look like...
     #
     # "5.9.110.236:9030 orport=9001 id=0756B7CD4DFC8182BE23143FAC0642F515182CEB"
@@ -462,8 +460,8 @@ class Fallback(Directory):
     if lines:
       line = lines.pop(0)

-      while lines and line != FALLBACK_DIV:
-        if line.strip() != ',':
+      while lines and line not in FALLBACK_DIV:
+        if line.strip() != ',' and not line.startswith('//'):   # ignore comments
           section_lines.append(line)

         line = lines.pop(0)

but then it fails further down the line:

In [2]: print(stem.directory.Fallback.from_remote())
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
File ~/src/stem/stem/directory.py:433, in Fallback.from_remote(timeout)
    431 results = {}
--> 433 for matches in _directory_entries(lines, Fallback._pop_section, (FALLBACK_ADDR, FALLBACK_NICKNAME, FALLBACK_EXTRAINFO, FALLBACK_IPV6), required = (FALLBACK_ADDR,)):
    434   address, dir_port, or_port, fingerprint = matches[FALLBACK_ADDR]  # type: ignore

File ~/src/stem/stem/directory.py:109, in _directory_entries(lines, pop_section_func, regexes, required)
    108 while next_section:
--> 109   yield _match_with(next_section, regexes, required)
    110   next_section = pop_section_func(lines)

File ~/src/stem/stem/directory.py:100, in _match_with(lines, regexes, required)
     99     if required_matcher not in matches:
--> 100       raise ValueError('Failed to parse mandatory data from:\n\n%s' % '\n'.join(lines))
    102 return matches

ValueError: Failed to parse mandatory data from:

"95.216.146.117 orport=9001 id=72270EB58EDEBE727AA29E67417628DBCE889FAE"
" ipv6=[2a01:4f9:c010:a48::1]:9001"
/* nickname=giesskanne */
/* extrainfo=0 */

During handling of the above exception, another exception occurred:

OSError                                   Traceback (most recent call last)
Cell In [2], line 1
----> 1 print(stem.directory.Fallback.from_remote())

File ~/src/stem/stem/directory.py:447, in Fallback.from_remote(timeout)
    436     results[fingerprint] = Fallback(
    437       address = address,
    438       or_port = int(or_port),
   (...)
    444       header = header,
    445     )
    446 except ValueError as exc:
--> 447   raise OSError(str(exc))
    449 return results

OSError: Failed to parse mandatory data from:

"95.216.146.117 orport=9001 id=72270EB58EDEBE727AA29E67417628DBCE889FAE"
" ipv6=[2a01:4f9:c010:a48::1]:9001"
/* nickname=giesskanne */
/* extrainfo=0 */

which is because FALLBACK_ADDR requires the port to be part of the IPv4 address but as far as I can see in the current file this seems no longer be the standard, so I made it optional, but I wasn't sure what to do if dir_port is not found, should it be the same value as or_port?

diff --git a/stem/directory.py b/stem/directory.py
index c4bde1c4..668c860c 100644
--- a/stem/directory.py
+++ b/stem/directory.py
@@ -60,10 +60,10 @@ AUTHORITY_V3IDENT = re.compile('"v3ident=([\\dA-F]{40}) "')
 FALLBACK_MAPPING = re.compile('/\\*\\s+(\\S+)=(\\S*)\\s+\\*/')

-FALLBACK_ADDR = re.compile('"([\\d\\.]+):(\\d+) orport=(\\d+) id=([\\dA-F]{40}).*')
+FALLBACK_ADDR = re.compile('"([\\d\\.]+)(?::(\\d+))? orport=(\\d+) id=([\\dA-F]{40}).*')
 FALLBACK_NICKNAME = re.compile('/\\* nickname=(\\S+) \\*/')
 FALLBACK_EXTRAINFO = re.compile('/\\* extrainfo=([0-1]) \\*/')
 FALLBACK_IPV6 = re.compile('" ipv6=\\[([\\da-f:]+)\\]:(\\d+)"')
@@ -420,8 +420,6 @@ class Fallback(Directory):
       else:
         raise OSError('Malformed fallback directory header line: %s' % line)

-    Fallback._pop_section(lines)  # skip human readable comments
-
     # Entries look like...
     #
     # "5.9.110.236:9030 orport=9001 id=0756B7CD4DFC8182BE23143FAC0642F515182CEB"
@@ -438,7 +436,7 @@ class Fallback(Directory):
         results[fingerprint] = Fallback(
           address = address,
           or_port = int(or_port),
-          dir_port = int(dir_port),
+          dir_port = int(dir_port or or_port),
           fingerprint = fingerprint,
           nickname = matches.get(FALLBACK_NICKNAME),  # type: ignore
           has_extrainfo = matches.get(FALLBACK_EXTRAINFO) == '1',
@@ -462,8 +460,8 @@ class Fallback(Directory):

also looking at the file it seems to me that the IPv4 port is no longer part of the host for any of the entries, should it be removed completely?

atagar commented 11 months ago

Hi Carine, this looks great. On first glance I only spotted a couple issues...

I wasn't sure what to do if dir_port is not found, should it be the same value as or_port?

Nope, if the DirPort is missing it should be None. I added an untested commit to my pr136 branch which should address it.

https://gitweb.torproject.org/user/atagar/stem.git/commit/?h=pr136

The second gotcha is that there were unit test regressions. For testing instructions see here. The output of ./run_tests.py --unit for commit 9b4a048a was...

TESTING FAILED (8 seconds)
  [UNIT TEST] test_inner_layer_creation (test.unit.descriptor.hidden_service_v3.TestHiddenServiceDescriptorV3) ... FAIL
  [UNIT TEST] test_exit_used (test.unit.examples.TestExamples) ... FAIL

... whereas with these fixes (commit d98476dc) it became...

TESTING FAILED (9 seconds)
  [UNIT TEST] test_inner_layer_creation (test.unit.descriptor.hidden_service_v3.TestHiddenServiceDescriptorV3) ... FAIL
  [UNIT TEST] test_from_remote (test.unit.directory.fallback.TestFallback) ... ERROR
  [UNIT TEST] test_from_remote_malformed (test.unit.directory.fallback.TestFallback) ... FAIL
  [UNIT TEST] test_exit_used (test.unit.examples.TestExamples) ... FAIL

Once upon a time Stem had phenomenal test coverage. Errors have begun to crop up but we shouldn't merge anything that causes further backsliding.

PascalinDe commented 11 months ago

Hi Damian, thanks for your feedback, I'll look into it!

PascalinDe commented 11 months ago

Hi Carine, this looks great. On first glance I only spotted a couple issues...

I wasn't sure what to do if dir_port is not found, should it be the same value as or_port?

Nope, if the DirPort is missing it should be None. I added an untested commit to my pr136 branch which should address it.

https://gitweb.torproject.org/user/atagar/stem.git/commit/?h=pr136

OK, I'll integrate and test it :+1:

The second gotcha is that there were unit test regressions. For testing instructions see here. The output of ./run_tests.py --unit for commit 9b4a048 was...

You're right on the new errors, I'll look into them - although I get different an additional error on the current master HEAD:

TESTING FAILED (1 seconds)
  [UNIT TEST] test_compression_lzma (test.unit.descriptor.remote.TestDescriptorDownloader) ... FAIL
  [UNIT TEST] test_exit_used (test.unit.examples.TestExamples) ... FAIL
  [UNIT TEST] test_relay_connections (test.unit.examples.TestExamples) ... FAIL
PascalinDe commented 10 months ago

Hi, just making sure you saw my last comment re Fallback._write on https://github.com/torproject/stem/pull/138? If it's alright with you as-is I can close this issue.

atagar commented 10 months ago

Hi Carine. Yup, I replied to the comment about Fallback._write. Closing.