rndusr / torf

Python module to create, parse and edit torrent files and magnet links
GNU General Public License v3.0
180 stars 17 forks source link

validate() should not check files #46

Open dechamps opened 2 months ago

dechamps commented 2 months ago

I find the following… odd (torf 4.2.7):

import os
import torf

with open("test.txt", "w") as file:
    file.write("test")
torrent = torf.Torrent(path="test.txt")
torrent.generate()
os.unlink("test.txt")
print(torrent.infohash)
Traceback (most recent call last):
  File ".venv/lib/python3.12/site-packages/torf/_torrent.py", line 1032, in infohash
    return self._infohash
           ^^^^^^^^^^^^^^
AttributeError: 'Torrent' object has no attribute '_infohash'. Did you mean: 'infohash'?

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "torfpath.py", line 9, in <module>
    print(torrent.infohash)
          ^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.12/site-packages/torf/_torrent.py", line 1034, in infohash
    raise e
  File ".venv/lib/python3.12/site-packages/torf/_torrent.py", line 1021, in infohash
    self.validate()
  File ".venv/lib/python3.12/site-packages/torf/_torrent.py", line 1413, in validate
    raise error.MetainfoError(f"Metainfo includes {self.path} as file, but it is not a file")
torf._errors.MetainfoError: Invalid metainfo: Metainfo includes test.txt as file, but it is not a file

Especially since the following works:

import os
import torf

with open("test.txt", "w") as file:
    file.write("test")
torrent = torf.Torrent(path="test.txt")
torrent.generate()
torrent.write("test.torrent")

os.unlink("test.txt")

torrent = torf.Torrent.read("test.torrent")
print(torrent.infohash)

As it turns out there are a number of problems here:

  1. The documentation of validate() does not mention anything about checking files.
    • The docs only state "Check if all mandatory keys exist in metainfo and all standard keys have correct types"
  2. I don't think it makes sense for validate() to do file I/O - that's surprising and inefficient.
    • Folding this logic into verify() would make much more sense (e.g. "verify lite" which only checks file existence but not the contents)
  3. This behavior only kicks in if the torrent was originally generated from a path.
    • This means some Torrent objects will check files in validate(), others won't. That's very confusing.
  4. The Torrent.infohash get accessor calls validate(), which means simply writing the expression torrent.infohash is enough to trigger file I/O and fail on missing files, which is ridiculous.

A workaround is to set torrent._path = None immediately after torrent.generate() to make the object forget about the files and make it behave like a torrent read from a torrent file. (Note that torrent.path = None won't quite work because it clears pieces internally which then makes validate() fail. I suspect that's another bug.)

rndusr commented 2 months ago

Yes, unfortunately, torf needs to get a lot of its smartness ripped out and replaced with a healthy chunk of dumbness. But that's essentially a rewrite.

This particular issue might get fixed like this:

@property
def path(self):
    if self._path and os.path.exists(self._path):
        return self._path

But I'm not sure if that creates any other issues. I don't think it should, but maybe torf disagrees.