Can preprocess be passed the real URL being processed? - Githubissues

cdhigh / KindleEar

Aggregates RSS and web content(Calibre recipe), sends to Kindle, and includes an e-ink optimized online reader.

http://cdhigh.github.io/KindleEar/

MIT License

2.73k stars 630 forks source link

Can preprocess be passed the real URL being processed? #638

Closed hjianhao closed 3 years ago

hjianhao commented 3 years ago

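Some websites may need special handling, so I'd like to be able to tell from the URL which site is being processed.

Currently the only option is to check fields in the header (for example, site-name), and some sites' headers may not contain any clear identifying information.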

hjianhao commented 3 years ago

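Also, different sites use different fields, so the content has to be scanned several times, which hurts efficiency. Identifying the site from the URL is simpler.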
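To illustrate why a URL makes this check cheap, here is a minimal sketch of hostname-based site detection using only the Python standard library. The function name `site_from_url`, the `SITE_BY_HOST` table, and the hostnames in it are assumptions for illustration; none of them are part of KindleEar.

```python
from urllib.parse import urlparse

# Hypothetical mapping from hostname to a site label; not part of KindleEar.
SITE_BY_HOST = {
    "mp.weixin.qq.com": "wechat",
    "toutiao.io": "toutiao",
}


def site_from_url(url):
    """Return a site label for the URL, or None if the host is unknown."""
    host = (urlparse(url).hostname or "").lower()
    for known_host, label in SITE_BY_HOST.items():
        # Match the host itself or any subdomain of it (e.g. www.toutiao.io).
        if host == known_host or host.endswith("." + known_host):
            return label
    return None
```

A single dictionary scan over the hostname replaces the repeated scans of the page content described above.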

cdhigh commented 3 years ago

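These callbacks (hooks) each target a specific "book", so the addresses should all be known in advance. Putting links to different addresses together in one book is not good engineering practice; consider splitting them into several files.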

hjianhao commented 3 years ago

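For a single specific site there is indeed no such need. This came up with 开发者头条 (Developer Toutiao), which I have been working on recently: authors who post there can link to content from many different sites (more than half of it is WeChat official-account articles), and I need to set which tags to keep (keep_only_tags) for each site.

A case like this can't be split into multiple files. It would of course be best to have this supported, for example through something like a preprocessEx; if not, I'll have to parse the content and look for identifying features.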
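The two options mentioned here can be sketched side by side: choose the `keep_only_tags` rules from the URL when one is available, and otherwise fall back to looking for a site-name field in the page content. This is only a sketch under stated assumptions. `keep_only_tags_for`, `SiteNameParser`, the rule tables, and the selectors are all hypothetical, and the rule shape follows the Calibre recipe convention; whether this KindleEar version accepts the same shape is an assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Illustrative rules only; hostnames, site names, and selectors are assumptions.
RULES_BY_HOST = {
    "mp.weixin.qq.com": [dict(name="div", attrs={"id": "js_content"})],
}
RULES_BY_SITE_NAME = {
    "Example Blog": [dict(name="article")],
}
DEFAULT_RULES = []  # empty list: no site-specific filtering


class SiteNameParser(HTMLParser):
    """Records the content of the first <meta property="og:site_name"> tag."""

    def __init__(self):
        super().__init__()
        self.site_name = None

    def handle_starttag(self, tag, attrs):
        if tag == "meta" and self.site_name is None:
            attr_map = dict(attrs)
            if attr_map.get("property") == "og:site_name":
                self.site_name = attr_map.get("content")


def keep_only_tags_for(content, url=None):
    """Pick rules by URL when given, else by the site name found in content."""
    if url:
        host = (urlparse(url).hostname or "").lower()
        if host in RULES_BY_HOST:
            return RULES_BY_HOST[host]
    # Fallback: scan the page for a site-name field.
    parser = SiteNameParser()
    parser.feed(content)
    return RULES_BY_SITE_NAME.get(parser.site_name, DEFAULT_RULES)
```

With a URL the lookup never touches the page body; without one, every page has to be parsed, and pages that lack the meta field fall through to the default.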
