iawia002 / lux

👾 Fast and simple video download library and CLI tool written in Go
MIT License

iqiyi video URLs have all changed; downloads no longer work #876

Open qbabe opened 3 years ago

qbabe commented 3 years ago

The old-style URL was http://tw.iqiyi.com/v_23xpzvuuqn0.html. The new-style URL is https://www.iq.com/play/23xpzvuuqn0, and it can no longer be parsed or downloaded.
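For reference, the two URL shapes quoted above share the same identifier (`23xpzvuuqn0`). A minimal Go sketch that pulls it out of either form follows. The helper name `videoSlug` is hypothetical, and the patterns cover only the two URLs reported in this issue; other iQiyi URL shapes may exist.

```go
package main

import (
	"fmt"
	"regexp"
)

// Patterns for the two URL shapes quoted in this issue only.
var (
	oldURL = regexp.MustCompile(`^https?://tw\.iqiyi\.com/v_([0-9a-z]+)\.html$`)
	newURL = regexp.MustCompile(`^https?://www\.iq\.com/play/([0-9a-z]+)$`)
)

// videoSlug (hypothetical helper) returns the identifier shared by both
// URL shapes, or "" if the URL matches neither pattern.
func videoSlug(u string) string {
	for _, re := range []*regexp.Regexp{oldURL, newURL} {
		if m := re.FindStringSubmatch(u); m != nil {
			return m[1]
		}
	}
	return ""
}

func main() {
	fmt.Println(videoSlug("http://tw.iqiyi.com/v_23xpzvuuqn0.html"))
	fmt.Println(videoSlug("https://www.iq.com/play/23xpzvuuqn0"))
}
```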

Newly released videos no longer have an old-style URL. Even if I convert the URL to the old format by hand, it plays fine in the browser, but annie fails to parse it and cannot download it. It only shows:

Downloading http://tw.iqiyi.com/v_23xpzvuuqn0.html error:
json: cannot unmarshal string into Go struct field .data.vp.tkl of type []struct { Vs []struct { Bid int "json:\"bid\""; Scrsz string "json:\"scrsz\""; Vsize int64 "json:\"vsize\""; Fs []struct { L string "json:\"l\""; B int64 "json:\"b\"" } "json:\"fs\"" } "json:\"vs\"" }

Could someone who knows the code please help?

ghost commented 3 years ago

Try https://github.com/bobosanm/Test

qbabe commented 3 years ago

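@bobosanm The f4v segments downloaded with iqGrap.exe seem to play only straight through from start to finish; seeking forward or backward does not work. Any attempt to seek within a segment jumps back to 00:00:00 (the beginning). The downloaded segments also have no subtitles at all, yet every one is stamped with the iqiyi watermark logo. For Korean dramas, there is no way to follow the story without subtitles. The f4v downloads also do not seem to run at full speed and take a very long time, and the segments apparently cannot be merged by the user.

Could you refer to the algorithm that annie was originally based on? https://github.com/zhangn1985/ykdl/commit/0bfd0a0cece6f982df641337325862175602cae4#diff-5f7f459abf17b98d5275a80eb3baa0ae

The f4v segments downloaded with that algorithm definitely include subtitles and have no watermark. Each segment plays and seeks normally, and the segments can be merged at the end. The only problem is that annie currently does not support https://www.iq.com URLs.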

ghost commented 3 years ago

I added iq.com support to annie and left everything else unchanged. The main change is parsing the tvid and vid from the iq.com page.

twu2 commented 3 years ago

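I tried patching it. It looks like the only changes are that the URL is now on iq.com and that the correct tvid (and title) can no longer be found. Once the correct tvid is obtained, downloading works normally.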

diff -Nur -x .git annie/extractors/extractors.go annie.patched/extractors/extractors.go
--- annie/extractors/extractors.go      2021-06-26 00:56:03.846803530 +0800
+++ annie.patched/extractors/extractors.go      2021-06-26 01:00:28.961230925 +0800
@@ -53,7 +53,7 @@
                "youku":      youku.New(),
                "youtube":    youtubeExtractor,
                "youtu":      youtubeExtractor, // youtu.be
-               "iqiyi":      iqiyi.New(),
+               "iq":         iqiyi.New(),
                "mgtv":       mgtv.New(),
                "tangdou":    tangdou.New(),
                "tumblr":     tumblr.New(),
diff -Nur -x .git annie/extractors/iqiyi/iqiyi.go annie.patched/extractors/iqiyi/iqiyi.go
--- annie/extractors/iqiyi/iqiyi.go     2021-06-26 00:56:03.866803713 +0800
+++ annie.patched/extractors/iqiyi/iqiyi.go     2021-06-26 01:13:12.092222404 +0800
@@ -39,7 +39,7 @@
        L string `json:"l"`
 }

-const iqiyiReferer = "https://www.iqiyi.com"
+const iqiyiReferer = "https://www.iq.com"

 func getMacID() string {
        var macID string
@@ -102,7 +102,9 @@

 // Extract is the main function to extract the data.
 func (e *extractor) Extract(url string, _ types.Options) ([]*types.Data, error) {
-       html, err := request.Get(url, iqiyiReferer, nil)
+       html, err := request.Get(url, iqiyiReferer, map[string]string{
+               "Accept-Language":   "zh-TW",
+       })
        if err != nil {
                return nil, err
        }
@@ -117,6 +119,7 @@
                        `data-player-tvid="([^"]+)"`,
                        `param\['tvid'\]\s*=\s*"(.+?)"`,
                        `"tvid":"(\d+)"`,
+                       `"tvId":(\d+)`,
                )
        }
        if tvid == nil || len(tvid) < 2 {
@@ -144,15 +147,15 @@
        if err != nil {
                return nil, err
        }
-       title := strings.TrimSpace(doc.Find("h1>a").First().Text())
-       var sub string
-       for _, k := range []string{"span", "em"} {
-               if sub != "" {
-                       break
-               }
-               sub = strings.TrimSpace(doc.Find("h1>" + k).First().Text())
-       }
-       title += sub
+       title := strings.TrimSpace(doc.Find("span#pageMetaTitle").First().Text())
+       sub := utils.MatchOneOf(
+                       html,
+                       `"subTitle":"([^"]+)","isoDuration":`,
+               )
+       if sub != nil && len(sub) > 1 {
+               title += " "
+               title += sub[1]
+        }
        if title == "" {
                title = doc.Find("title").Text()
        }
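
The two lookups the patch adds (the `"tvId":(\d+)` pattern and the `subTitle` pattern) can be tried in isolation. The sketch below uses only the standard `regexp` package instead of annie's `utils.MatchOneOf`, and the page fragment is made up. The helper names `firstGroup` and `buildTitle` are hypothetical. It checks the match length before reading the capture group, so a page without a `subTitle` field cannot cause an index-out-of-range panic.

```go
package main

import (
	"fmt"
	"regexp"
)

// Patterns taken from the patch above.
var (
	tvidRe = regexp.MustCompile(`"tvId":(\d+)`)
	subRe  = regexp.MustCompile(`"subTitle":"([^"]+)","isoDuration":`)
)

// firstGroup (hypothetical helper) returns capture group 1, or "" when
// the pattern does not match.
func firstGroup(re *regexp.Regexp, s string) string {
	m := re.FindStringSubmatch(s)
	if len(m) < 2 {
		return ""
	}
	return m[1]
}

// buildTitle appends the subtitle to the title only when one was found.
func buildTitle(title, html string) string {
	if sub := firstGroup(subRe, html); sub != "" {
		title += " " + sub
	}
	return title
}

func main() {
	// Made-up fragment; the real iq.com page layout may differ.
	html := `{"tvId":1234567890,"subTitle":"Episode 1","isoDuration":"PT45M"}`
	fmt.Println(firstGroup(tvidRe, html))
	fmt.Println(buildTitle("Some Show", html))
	fmt.Println(buildTitle("Some Show", "<html></html>"))
}
```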
qbabe commented 3 years ago

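How do I download using this method? Could you describe it in more detail? Does it require installing a separate crawler or something similar? iqiyi now always redirects to the international site, iq.com, and there is no way to get back to the simpler Taiwan site, so annie no longer seems able to download directly.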

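twu2 commented 3 years ago

  1. On annie's GitHub page, choose Code => Download ZIP, then unpack the archive.
  2. Edit the code as shown in the patch above, then run go build in the unpacked annie-master directory. This produces the executable.

PS. You need a Go environment. On Windows, download it from https://golang.org/doc/install.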

Sparh4wk commented 2 years ago

I just built annie from the latest commits, but iq.com does not seem to work. Any idea what I might be doing wrong?

Every time I try to download something, this error occurs:

Downloading https://www.iq.com/play/21m8w3qnnl8 error:
json: cannot unmarshal string into Go struct field .data.vp.tkl of type []struct { Vs []struct { Bid int "json:\"bid\""; Scrsz string "json:\"scrsz\""; Vsize int64 "json:\"vsize\""; Fs []struct { L string "json:\"l\""; B int64 "json:\"b\"" } "json:\"fs\"" } "json:\"vs\"" }

twu2 commented 2 years ago

That is not the same issue. iQiyi blocks access to some VIP videos at the getVPS() call, so annie does not work for those videos.