Closed xiyihan0 closed 8 months ago
A possible patch:
import json
import re

class AppPixivAPI(BasePixivAPI):
    ...
    def novel_text(self, novel_id: int | str, req_auth: bool = True) -> ParsedJson:
        url = "%s/webview/v2/novel" % self.hosts
        params = {
            "id": novel_id,
        }
        r = self.no_auth_requests_call("GET", url, params=params, req_auth=req_auth)
        # The page embeds the novel as a JSON object between "novel:" and "isOwnWork".
        match = re.search(r"novel:\s({.+}),\s+isOwnWork", r.text)
        if match is None:
            raise ValueError("novel JSON not found in webview response")
        return {"novel_text": json.loads(match.group(1))["text"]}
Thank you for the reminder. I will verify the v2 API and release an updated version later.
We currently have two ways to obtain the text of a novel. One is https://www.pixiv.net/ajax/novel/{novel_id}, mentioned in #264 (no authentication required?). However, the data returned by this interface is quite extensive and not very compatible with novel_text.
The other option is /webview/v2/novel, the interface currently used by the iOS app. It behaves as follows:
GET /webview/v2/novel?id=21630272&font=mincho&font_size=1.0em&line_height=1.8&color=%231F1F1F&background_color=%23FFFFFF&mode=horizontal&theme=light&margin_top=60px&margin_bottom=50px&viewer_version=20221031_ai HTTP/1.1
Host: app-api.pixiv.net
-------------------------------
HTTP/1.1 200 OK
Content-Type: text/html; charset=UTF-8
<!DOCTYPE html><html lang="zh"><head><title>pixiv</title><link rel="canonical" href="/novel/show.php?id=21630272"><meta name="robots" content="noindex"><meta http-equiv="content-type" content="text/html; charset=utf-8"><script>
Object.defineProperty(window, 'pixiv', {
value: {
viewerVersion: "20221031_ai",
isV2: true,
userLang: "zh",
novel:
{"id":"21630272","title":"\u4ffa\u306e\u4e8b\u304c\u5927\u597d\u304d\u306a\u5148\u8f29\u5f8c\u8f29\u3061\u3083\u3093\u304c\u3072\u305f\u3059\u3089\u80b2\u4e73\u3057\u3066\u8d85\u7206\u4e73\u306b\u306a\u308b\u8a71","seriesId":null,"seriesTitle":null,"seriesIsWatched":null,"userId":"32020681","coverUrl":"https:\/\/i.pximg.net\/c\/240x480_80\/novel-cover-master\/img\/2024\/02\/25\/15\/07\/26\/ci21630272_2f4d16677b64694b8fe208a141b34d91_master1200.jpg","tags":["R-18"],...}
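For reference, the query string of the captured request can be reproduced with the standard library. This is only a sketch: build_webview_url and WEBVIEW_PARAMS are hypothetical names, the values are copied from the capture above, and it has not been verified which of these parameters the server actually requires.

```python
from urllib.parse import urlencode

# Values copied from the captured iOS request; which of them the server
# actually requires is an open question, not something verified here.
WEBVIEW_PARAMS = {
    "font": "mincho",
    "font_size": "1.0em",
    "line_height": "1.8",
    "color": "#1F1F1F",
    "background_color": "#FFFFFF",
    "mode": "horizontal",
    "theme": "light",
    "margin_top": "60px",
    "margin_bottom": "50px",
    "viewer_version": "20221031_ai",
}


def build_webview_url(novel_id, host="https://app-api.pixiv.net"):
    """Hypothetical helper: build the /webview/v2/novel URL for a novel ID."""
    # urlencode percent-encodes "#" as "%23", matching the capture.
    query = urlencode({"id": novel_id, **WEBVIEW_PARAMS})
    return f"{host}/webview/v2/novel?{query}"
```

Calling build_webview_url(21630272) should yield the same path and query string as the GET line in the capture.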
It appears that this interface returns an HTML page that includes both the novel text and its formatting. Therefore, in addition to parsing out the text, I will add a raw={False|True} parameter to the novel_text API; with raw=True it returns the complete HTML content.
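A rough sketch of how such a raw switch might behave, working on the response body only. parse_webview_novel is a hypothetical name and this is not PixivPy's actual implementation; it assumes the regular expression from the patch earlier in this thread still matches the page, and that the embedded JSON carries a "text" field as that patch expects.

```python
import json
import re

# Same pattern as the patch earlier in this thread: the novel JSON sits
# between "novel:" and "isOwnWork" in the page's inline script.
_NOVEL_RE = re.compile(r"novel:\s({.+}),\s+isOwnWork")


def parse_webview_novel(html: str, raw: bool = False):
    """Hypothetical helper: return the raw HTML or the extracted novel text."""
    if raw:
        # Caller asked for the complete HTML, formatting included.
        return html
    match = _NOVEL_RE.search(html)
    if match is None:
        # The page layout changed or the request failed; fail loudly.
        raise ValueError("novel JSON not found in webview HTML")
    return {"novel_text": json.loads(match.group(1))["text"]}
```

Keeping the default at raw=False would leave the return shape of existing novel_text callers unchanged.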
Tested on PixivPy 3.7.0:
This method has been affected since the latest update of the Pixiv app.