clj-python / libpython-clj

Python bindings for Clojure
Eclipse Public License 2.0

Error using YoutubeDL #145

Closed WilhelmBerggren closed 3 years ago

WilhelmBerggren commented 3 years ago

While attempting to use YoutubeDL I encountered an issue. Minimal example:

(require '[libpython-clj.require :refer [require-python]])
(require '[libpython-clj.python :as py])

(def url "https://www.youtube.com/watch?v=BaW_jenozKc")
(def command (str "import youtube_dl\nyoutube_dl.YoutubeDL().download([\"" url "\"])"))

(py/initialize!)
(require-python 'youtube_dl)

;; Both return: TypeError: argument of type 'NoneType' is not iterable
(py/py. (youtube_dl/YoutubeDL) "download" [url])
(py/run-simple-string command)

Gist with stacktrace

James Tolton on Zulip gave the following workaround:

(require '[libpython-clj.require :refer [require-python import-python]])
(require '[libpython-clj.python :as py :refer [py..]])
(import-python)
(require-python '[youtube_dl :as ytdl :bind-ns true])
(require-python '[sys :bind-ns true])
(def stdout (python/open "/tmp/stdout" "w"))
(def globals (:globals (py/run-simple-string "")))
(py/set-item! globals "stdout" stdout)
(py/run-simple-string "import sys
sys.stdout = stdout
" :globals globals)
(py.. ytdl YoutubeDL (download (python/list ["https://www.youtube.com/watch?v=BaW_jenozKc"] )))

He said: "It looks like this has something to do with the way we rebind sys.stdout or sys.stdin."
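For readers who think in Python, here is a rough sketch of what the workaround appears to do on the Python side: it swaps sys.stdout for an ordinary file object, which carries the string `mode` and the binary `buffer` attributes that a real text stream has. The temporary path and the variable names are illustrative assumptions, not part of the original workaround.

```python
import os
import sys
import tempfile

# Illustrative path; the workaround above used "/tmp/stdout".
path = os.path.join(tempfile.mkdtemp(), "stdout")

original = sys.stdout
real_file = open(path, "w")
try:
    # Same idea as `sys.stdout = stdout` in the workaround.
    sys.stdout = real_file
    # A file opened with open() has a string mode and a binary buffer.
    has_mode = isinstance(sys.stdout.mode, str)
    has_buffer = hasattr(sys.stdout.buffer, "write")
    print("hello from the redirected stream")
finally:
    # Always restore the previous stream and release the file.
    sys.stdout = original
    real_file.close()

with open(path) as f:
    captured = f.read()
```

Anything printed while the file is installed ends up in the file rather than in the REPL, which is the trade-off of this approach.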

I love what this library is going for and I hope this helps. Thanks!

jjtolton commented 3 years ago

Thanks so much for the write-up!

cnuernber commented 3 years ago

I think youtube_dl is using the output stream in a unique way. We implement a minimal output stream, but if, for instance, I add a 'mode' parameter (which their code assumes to be a non-nil string), then later on we get:

509, in _write_string
    write_string(s, out=out, encoding=self.params.get('encoding'))
  File "/home/chrisn/.local/lib/python3.8/site-packages/youtube_dl/utils.py", line 3180, in write_string
    out.buffer.write(byt)
AttributeError: 'NoneType' object has no attribute 'write'

So later on they reach into the output stream, find a buffer object, and write directly to that rather than using the output stream's write method. In our case we have no buffer.

It could be that, instead of overwriting the entire output stream for stdout and friends, we could just implement a 'buffer' type. But I did not see much documentation about the internals of how these things work, and youtube_dl finds it necessary to bind tightly to a specific implementation of the output stream in order to download data from the internet.
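To make the two failure modes concrete, here is a small Python sketch. It is not youtube_dl's code and not libpython-clj's stream: `MinimalStream`, `BufferedStream`, `write_string_like`, and `try_write` are hypothetical names that only imitate the access pattern visible in the traceback, under the assumption that a missing `mode` or `buffer` shows up as None.

```python
import io


class MinimalStream:
    """Hypothetical replacement stream: write/flush only, no real buffer."""

    def __init__(self, mode=None):
        self.chunks = []
        self.mode = mode    # None triggers the first error
        self.buffer = None  # no underlying binary buffer

    def write(self, s):
        self.chunks.append(s)
        return len(s)

    def flush(self):
        pass


class BufferedStream(MinimalStream):
    """Same stream, but with a string mode and a binary 'buffer' object."""

    def __init__(self):
        super().__init__(mode="w")
        self.buffer = io.BytesIO()


def write_string_like(s, out, encoding="utf-8"):
    """Simplified imitation of a writer that bypasses out.write()."""
    if "b" in out.mode:  # TypeError when mode is None
        out.write(s.encode(encoding))
    elif hasattr(out, "buffer"):
        out.buffer.write(s.encode(encoding))  # AttributeError when buffer is None
    else:
        out.write(s)
    out.flush()


def try_write(out):
    """Return 'ok' or the name and message of the error that was raised."""
    try:
        write_string_like("hello\n", out)
        return "ok"
    except (TypeError, AttributeError) as e:
        return f"{type(e).__name__}: {e}"


no_mode = try_write(MinimalStream(mode=None))    # fails on the mode check
with_mode = try_write(MinimalStream(mode="w"))   # fails on buffer.write
buffered = BufferedStream()
with_buffer = try_write(buffered)                # succeeds
```

In this sketch, supplying a `mode` string only moves the failure to the `buffer` access, which matches the sequence described above; a stream that also exposes a bytes-accepting `buffer` gets through.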

So James' fix is reasonable and there is also an initialization parameter to disable the io redirection:

https://clj-python.github.io/libpython-clj/libpython-clj.python.html#var-initialize.21

In your code you would need to call initialize! before requiring libpython-clj.require, that is, before (require '[libpython-clj.require :refer [require-python import-python]]).

cnuernber commented 3 years ago

This issue has now changed a bit :-). We still default to redirecting IO, but we no longer need to redirect IO in order to detect errors, so (py/initialize! {:no-io-redirect? true}) will sidestep the issue.

In addition, we support static namespace generation, and these namespaces lazily load the required Python variables, so they do not auto-initialize Python the way require-python has to.

We have to redirect IO in order to provide a great experience to new users, but with the two additions above I think it is safe to close this issue as 'didthebestwecould' :-).