wader / goutubedl

Go wrapper for youtube-dl and yt-dlp
https://pkg.go.dev/github.com/wader/goutubedl?tab=doc
MIT License

Problem with reddit links #173

Closed j1cs closed 6 months ago

j1cs commented 7 months ago

The following program:

package main

import (
    "context"
    "fmt"
    "io"
    "log"
    "os"

    "github.com/wader/goutubedl"
)

func main() {
    result, err := goutubedl.New(context.Background(), "https://www.reddit.com/r/newsbabes/s/92rflI0EB0", goutubedl.Options{})
    if err != nil {
        log.Fatal(err)
    }
    downloadResult, err := result.Download(context.Background(), "best")
    if err != nil {
        log.Fatal(err)
    }
    defer downloadResult.Close()
    f, err := os.Create("output")
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()
    w, err := io.Copy(f, downloadResult)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("written %d\n", w)
}

The result is "written 0", so the downloaded file ends up empty. Running yt-dlp directly works fine:

$ dl "https://www.reddit.com/r/newsbabes/s/92rflI0EB0"
Extracting cookies from chrome
Extracted 842 cookies from chrome
[generic] Extracting URL: https://www.reddit.com/r/newsbabes/s/92rflI0EB0
[generic] 92rflI0EB0: Downloading webpage
[redirect] Following redirect to https://www.reddit.com/r/newsbabes/comments/1am0l3z/ana_mafud_telemundo_arizona/?share_id=qwy0V17xM17tVYObLaChX&utm_content=1&utm_medium=android_app&utm_name=androidcss&utm_source=share&utm_term=14&rdt=40937
[Reddit] Extracting URL: https://www.reddit.com/r/newsbabes/comments/1am0l3z/ana_mafud_telemundo_arizona/?share_id=qwy0V17...tm_term=14&rdt=40937
[Reddit] 1am0l3z: Downloading JSON metadata
[Reddit] 1am0l3z: Downloading m3u8 information
[Reddit] 1am0l3z: Downloading MPD manifest
[info] x0br599s8ehc1: Downloading 1 format(s): hls-1810+dash-7
[hlsnative] Downloading m3u8 manifest
[hlsnative] Total fragments: 6
[download] Destination: video_2024-03-02_230822.fhls-1810.mp4
[download] 100% of    4.51MiB in 00:00:00 at 21.22MiB/s
[download] Destination: video_2024-03-02_230822.fdash-7.m4a
[download] 100% of  358.74KiB in 00:00:00 at 17.55MiB/s
[Merger] Merging formats into "video_2024-03-02_230822.mp4"
Deleting original file video_2024-03-02_230822.fhls-1810.mp4 (pass -k to keep)
Deleting original file video_2024-03-02_230822.fdash-7.m4a (pass -k to keep)

The dl command is a bash function:

function dl {
    # Timestamped output name, e.g. video_2024-03-02_230822.mp4
    file="video_$(date +'%Y-%m-%d_%H%M%S').mp4"
    yt-dlp --cookies-from-browser chrome -f "bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best" "$1" -o "$file" && touch "$file"
}

TheForcer commented 7 months ago

As soon as you define a filter for a download from reddit, goutubedl will write 0 bytes. With an empty filter string, as in downloadResult, err := result.Download(context.Background(), ""), the download works fine :)

wader commented 7 months ago

Hey, for some reason I didn't get any notifications for this issue.

Could you try:

package main

import (
    "context"
    "fmt"
    "io"
    "log"
    "os"
    "os/exec"

    "github.com/wader/goutubedl"
)

func main() {
    result, err := goutubedl.New(
        context.Background(),
        "https://www.reddit.com/r/newsbabes/s/92rflI0EB0",
        goutubedl.Options{
            DebugLog: log.Default(),
            StderrFn: func(cmd *exec.Cmd) io.Writer {
                return os.Stderr
            },
        },
    )
    if err != nil {
        log.Fatal(err)
    }
    downloadResult, err := result.Download(context.Background(), "best")
    if err != nil {
        log.Fatal(err)
    }
    defer downloadResult.Close()
    f, err := os.Create("output")
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()
    w, err := io.Copy(f, downloadResult)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("written %d\n", w)
}

With that I get:

# yt-dlp --version
2023.12.30

# go run cmd/test/test.go
2024/03/07 15:18:13 cmd [/usr/local/bin/yt-dlp --ignore-errors --no-call-home --no-cache-dir --skip-download --restrict-filenames --batch-file - -J]
Reading URLs from STDIN - EOF (Ctrl+D) to end:
2024/03/07 15:18:19 cmd [/usr/local/bin/yt-dlp --no-call-home --no-cache-dir --ignore-errors --newline --restrict-filenames -o - --load-info /tmp/ydls4005491150/info.json -f best]
WARNING: "-f best" selects the best pre-merged format which is often not the best option.
         To let yt-dlp download and merge the best available formats, simply do not pass any format selection.
         If you know what you are doing and want only the best pre-merged format, use "-f b" instead to suppress this warning
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/usr/local/bin/yt-dlp/__main__.py", line 17, in <module>
  File "/usr/local/bin/yt-dlp/yt_dlp/__init__.py", line 1009, in main
  File "/usr/local/bin/yt-dlp/yt_dlp/__init__.py", line 997, in _real_main
  File "/usr/local/bin/yt-dlp/yt_dlp/YoutubeDL.py", line 3555, in download_with_info_file
  File "/usr/local/bin/yt-dlp/yt_dlp/YoutubeDL.py", line 3516, in wrapper
  File "/usr/local/bin/yt-dlp/yt_dlp/YoutubeDL.py", line 1802, in process_ie_result
  File "/usr/local/bin/yt-dlp/yt_dlp/YoutubeDL.py", line 2914, in process_video_result
yt_dlp.utils.ExtractorError: [Reddit] x0br599s8ehc1: Requested format is not available. Use --list-formats for a list of available formats
written 0

I can replicate it with yt-dlp -f best https://www.reddit.com/r/newsbabes/s/92rflI0EB0 but for some reason the exception seems to be handled better there. If you run with -F you will see that there are no pre-merged formats (audio and video muxed into one file) available, which is what "best" requires.

If you leave the filter empty, it will by default be something like bestvideo*+bestaudio/best (see https://github.com/yt-dlp/yt-dlp?tab=readme-ov-file#format-selection).
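
To make the pre-merged point concrete, here is a minimal sketch that scans yt-dlp style info JSON for formats carrying both audio and video. It assumes the formats, format_id, acodec and vcodec fields of the -J output and that a missing stream is marked "none". The helper names and the sample data are made up for illustration, and this only approximates what yt-dlp's own selector does.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// format mirrors the few per-format fields of yt-dlp's JSON output used here.
type format struct {
	FormatID string `json:"format_id"`
	ACodec   string `json:"acodec"`
	VCodec   string `json:"vcodec"`
}

// info holds only the format list of the info JSON.
type info struct {
	Formats []format `json:"formats"`
}

// premergedIDs returns the IDs of formats whose audio and video codecs are
// both something other than "none" (assumption: "none" marks an absent stream).
func premergedIDs(infoJSON []byte) ([]string, error) {
	var i info
	if err := json.Unmarshal(infoJSON, &i); err != nil {
		return nil, err
	}
	var ids []string
	for _, f := range i.Formats {
		if f.ACodec != "none" && f.VCodec != "none" {
			ids = append(ids, f.FormatID)
		}
	}
	return ids, nil
}

func main() {
	// Made-up sample shaped like the Reddit case: one video-only and one
	// audio-only format, so nothing qualifies as pre-merged.
	sample := []byte(`{"formats":[
		{"format_id":"hls-1810","acodec":"none","vcodec":"avc1"},
		{"format_id":"dash-7","acodec":"mp4a","vcodec":"none"}]}`)
	ids, err := premergedIDs(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println("pre-merged formats:", ids)
}
```

When the list comes back empty, a "best" filter has nothing to select, and a filter that merges separate streams (or an empty filter) is needed instead.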

I did some testing and wonder if there is a bug in yt-dlp that makes it not handle the exception properly when using --load-info (goutubedl looks for ERROR: to know that something went wrong). You can replicate it with yt-dlp -j ... > info.json and then yt-dlp -f best --load-info info.json ...

j1cs commented 7 months ago

I got the same result as you:

$  yt-dlp --version
2023.12.30
$ go run main.go 
2024/03/07 22:22:21 cmd [/bin/yt-dlp --ignore-errors --no-call-home --no-cache-dir --skip-download --restrict-filenames --batch-file - -J]
Reading URLs from STDIN - EOF (Ctrl+D) to end:
2024/03/07 22:22:22 cmd [/bin/yt-dlp --no-call-home --no-cache-dir --ignore-errors --newline --restrict-filenames -o - --load-info /tmp/ydls3636320213/info.json -f best]
WARNING: "-f best" selects the best pre-merged format which is often not the best option.
         To let yt-dlp download and merge the best available formats, simply do not pass any format selection.
         If you know what you are doing and want only the best pre-merged format, use "-f b" instead to suppress this warning
Traceback (most recent call last):
  File "/bin/yt-dlp", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/lib/python3.11/site-packages/yt_dlp/__init__.py", line 1009, in main
    _exit(*variadic(_real_main(argv)))
                    ^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/yt_dlp/__init__.py", line 997, in _real_main
    return ydl.download_with_info_file(expand_path(opts.load_info_filename))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/yt_dlp/YoutubeDL.py", line 3555, in download_with_info_file
    self.__download_wrapper(self.process_ie_result)(info, download=True)
  File "/usr/lib/python3.11/site-packages/yt_dlp/YoutubeDL.py", line 3516, in wrapper
    res = func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/yt_dlp/YoutubeDL.py", line 1802, in process_ie_result
    ie_result = self.process_video_result(ie_result, download=download)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/yt_dlp/YoutubeDL.py", line 2914, in process_video_result
    raise ExtractorError(
yt_dlp.utils.ExtractorError: [Reddit] x0br599s8ehc1: Requested format is not available. Use --list-formats for a list of available formats
written 0

I left the filter option blank and it works:

package main

import (
    "context"
    "fmt"
    "io"
    "log"
    "os"
    "os/exec"

    "github.com/wader/goutubedl"
)

func main() {
    result, err := goutubedl.New(
        context.Background(),
        "https://www.reddit.com/r/newsbabes/s/92rflI0EB0",
        goutubedl.Options{
            DebugLog: log.Default(),
            StderrFn: func(cmd *exec.Cmd) io.Writer {
                return os.Stderr
            },
        },
    )
    if err != nil {
        log.Fatal(err)
    }
    downloadResult, err := result.Download(context.Background(), "")
    if err != nil {
        log.Fatal(err)
    }
    defer downloadResult.Close()
    f, err := os.Create("output.mp4")
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()
    w, err := io.Copy(f, downloadResult)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("written %d\n", w)
}

$ go run main.go
2024/03/07 22:30:02 cmd [/bin/yt-dlp --ignore-errors --no-call-home --no-cache-dir --skip-download --restrict-filenames --batch-file - -J]
Reading URLs from STDIN - EOF (Ctrl+D) to end:
2024/03/07 22:30:04 cmd [/bin/yt-dlp --no-call-home --no-cache-dir --ignore-errors --newline --restrict-filenames -o - --load-info /tmp/ydls1198810216/info.json]
[info] x0br599s8ehc1: Downloading 1 format(s): hls-1810+dash-7
[download] Destination: -
mime type is not rfc8216 compliant
[hls @ 0x575890dff2c0] Skip ('#EXT-X-VERSION:4')
[hls @ 0x575890dff2c0] Opening 'https://v.redd.it/x0br599s8ehc1/HLS_720.ts' for reading
    Last message repeated 1 times
Input #0, hls, from 'https://v.redd.it/x0br599s8ehc1/HLS_720.m3u8':
  Duration: 00:00:22.53, start: 0.177778, bitrate: 0 kb/s
  Program 0 
    Metadata:
      variant_bitrate : 0
  Stream #0:0: Video: h264 (Main) ([27][0][0][0] / 0x001B), yuv420p(tv, bt709), 720x1280, 30 fps, 30 tbr, 90k tbn
    Metadata:
      variant_bitrate : 0
Input #1, mov,mp4,m4a,3gp,3g2,mj2, from 'https://v.redd.it/x0br599s8ehc1/DASH_AUDIO_128.mp4':
  Metadata:
    major_brand     : mp41
    minor_version   : 0
    compatible_brands: iso8isommp41dashcmfc
    creation_time   : 2024-02-08T17:08:39.000000Z
  Duration: 00:00:22.50, start: 0.093000, bitrate: 130 kb/s
  Stream #1:0[0x1](und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 22 kb/s (default)
    Metadata:
      creation_time   : 2024-02-08T17:08:39.000000Z
      handler_name    : SoundHandler
      vendor_id       : [0][0][0][0]
Output #0, mpegts, to 'pipe:':
  Metadata:
    encoder         : Lavf60.16.100
  Stream #0:0: Video: h264 (Main) ([27][0][0][0] / 0x001B), yuv420p(tv, bt709), 720x1280, q=2-31, 30 fps, 30 tbr, 90k tbn
    Metadata:
      variant_bitrate : 0
  Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 22 kb/s (default)
    Metadata:
      creation_time   : 2024-02-08T17:08:39.000000Z
      handler_name    : SoundHandler
      vendor_id       : [0][0][0][0]
Stream mapping:
  Stream #0:0 -> #0:0 (copy)
  Stream #1:0 -> #0:1 (copy)
Press [q] to stop, [?] for help
[https @ 0x575890e9d340] Opening 'https://v.redd.it/x0br599s8ehc1/HLS_720.ts' for reading
[https @ 0x575891495dc0] Opening 'https://v.redd.it/x0br599s8ehc1/HLS_720.ts' for reading
[https @ 0x575890e9d340] Opening 'https://v.redd.it/x0br599s8ehc1/HLS_720.ts' for reading
[https @ 0x575891495dc0] Opening 'https://v.redd.it/x0br599s8ehc1/HLS_720.ts' for reading
[out#0/mpegts @ 0x5758914d24c0] video:4441kB audio:353kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 6.125208%
size=    5087kB time=00:00:22.43 bitrate=1857.6kbits/s speed= 252x    
[download] 100% in 00:00:00
written 5208916

wader commented 7 months ago

Thanks for testing! I've filed a yt-dlp issue: https://github.com/yt-dlp/yt-dlp/issues/9388

wader commented 7 months ago

So once that is fixed, using "best" in this case would still not work, but at least you would get a proper error.

j1cs commented 7 months ago

Great! Thanks for your work!

wader commented 7 months ago

The yt-dlp issue has been fixed. I haven't tried it with goutubedl, but it seems to work:

$ python3.12 -m yt_dlp -j https://www.reddit.com/r/newsbabes/s/92rflI0EB0 > info.json
$ python3.12 -m yt_dlp --load-info-json info.json -f best
WARNING: "-f best" selects the best pre-merged format which is often not the best option.
         To let yt-dlp download and merge the best available formats, simply do not pass any format selection.
         If you know what you are doing and want only the best pre-merged format, use "-f b" instead to suppress this warning
ERROR: [Reddit] x0br599s8ehc1: Requested format is not available. Use --list-formats for a list of available formats

Close once there is a new yt-dlp release?

wader commented 7 months ago

https://github.com/yt-dlp/yt-dlp/releases/tag/2024.03.10 has been released with a fix

But I noticed now that the download method has no code that looks for an error line, and it's a bit trickier in that case because it returns before the yt-dlp process has exited. Will have to think about it.

wader commented 6 months ago

With the latest yt-dlp and #178 we should now get a proper error.