Colin-XKL / RSSmanX

RSSman X: a comprehensive RSS solution
https://github.com/Colin-XKL/RSSmanX
GNU General Public License v3.0
117 stars, 10 forks

A question about the ultimate clash rules #20

Closed MuXia-0326 closed 7 months ago

MuXia-0326 commented 8 months ago

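What I need is to use the v2ex- and pixiv-related RSS feeds on rsshub.

However, judging from the logs, rsshub matches the LAN rule and goes direct, so pixiv and v2ex cannot be reached.

Is there any solution? orz

clash log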

time="2024-01-11T15:00:33+08:00" level=info msg="[TCP] 172.18.0.7:43138 --> rsshub:80 match IPCIDR(172.16.0.0/12) using DIRECT"
time="2024-01-11T15:03:55+08:00" level=info msg="[TCP] 172.18.0.7:35512 --> rsshub:80 match IPCIDR(172.16.0.0/12) using DIRECT"

rsshub log

error: Request https://www.v2ex.com/api/topics/latest.json fail, retry attempt #2: RequestError: Client network socket disconnected before secure TLS connection was established
info: /pixiv/ranking/week, user IP: ::ffff:172.18.0.5
error: Pixiv refresh token failed.
error: undefined
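error: Error in /ranking/week: pixiv not login

Colin-XKL commented 8 months ago

The part of the clash log you looked at shows ttrss accessing rsshub through clash. Since the two are on the same network, the connection goes direct, which is expected. What you want is for rsshub to go through clash when it connects to external sites such as v2ex, right? For that, first make sure the proxy environment variable is set for rsshub. It is configured by default; check that it is working properly.

Also, regarding the pixiv log you posted at the bottom: it is telling you to log in. Without logging in, no content can be fetched from pixiv. Refer to the relevant rsshub documentation, configure the token, and try again.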
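
To illustrate the reading of the clash log above, here is a minimal Python sketch of first-match rule evaluation over a subset of the posted rules. It is a simplified model, not clash's actual matcher: `match_rule` is a hypothetical helper, the address 172.18.0.3 for the rsshub container is an assumption, and domain names are not resolved to IPs here.

```python
import ipaddress

# A subset of the rules from the posted config.yaml, in their original order.
RULES = [
    ("IP-CIDR", "127.0.0.0/8", "DIRECT"),
    ("IP-CIDR", "172.16.0.0/12", "DIRECT"),
    ("IP-CIDR", "10.0.0.0/8", "DIRECT"),
    ("IP-CIDR", "192.168.0.0/16", "DIRECT"),
    ("DOMAIN-KEYWORD", "pixiv", "fq"),
    ("DOMAIN-KEYWORD", "v2ex", "fq"),
    ("MATCH", "", "global-random"),
]


def match_rule(host, ip=None):
    """Return (rule_type, payload, policy) of the first rule that matches.

    `ip` is the destination address if it is known; when it is None the
    IP-CIDR rules are skipped (a simplification of this sketch).
    """
    for rule_type, payload, policy in RULES:
        if rule_type == "IP-CIDR" and ip is not None:
            if ipaddress.ip_address(ip) in ipaddress.ip_network(payload):
                return rule_type, payload, policy
        elif rule_type == "DOMAIN-KEYWORD" and payload in host:
            return rule_type, payload, policy
        elif rule_type == "MATCH":
            return rule_type, payload, policy
    return None


# ttrss -> rsshub inside the Docker network (assumed address): LAN rule, DIRECT.
print(match_rule("rsshub", "172.18.0.3"))
# rsshub -> v2ex through the proxy port: keyword rule, policy group fq.
print(match_rule("www.v2ex.com"))
```

Under these assumptions the container-to-container request lands on the 172.16.0.0/12 rule, while a request for a v2ex hostname would land on the `fq` group, provided it reaches clash at all.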

MuXia-0326 commented 8 months ago

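> The part of the clash log you looked at shows ttrss accessing rsshub through clash. Since the two are on the same network, the connection goes direct, which is expected. What you want is for rsshub to go through clash when it connects to external sites such as v2ex, right? For that, first make sure the proxy environment variable is set for rsshub. It is configured by default; check that it is working properly.
>
> Also, regarding the pixiv log you posted at the bottom: it is telling you to log in. Without logging in, no content can be fetched from pixiv. Refer to the relevant rsshub documentation, configure the token, and try again.

The pixiv token is configured, and I can see that docker-compose.yml does specify a proxy for rsshub, so I'm puzzled about why it can't connect.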

config.yaml

# clash config file
# version 1.8
# last update 2022-11-06

# clash config wiki
# https://github.com/Dreamacro/clash/wiki/Configuration

# Port of HTTP(S) proxy server on the local end
port: 8080

# Port of SOCKS5 proxy server on the local end
socks-port: 8085

# Transparent proxy server port for Linux and macOS (Redirect TCP and TProxy UDP)
# redir-port: 7892

# Transparent proxy server port for Linux (TProxy TCP and TProxy UDP)
# tproxy-port: 7893

# HTTP(S) and SOCKS5 server on the same port
# mixed-port: 7890

allow-lan: true
bind-address: '*'
mode: rule

# info / warning / error / debug / silent
log-level: info

ipv6: false

profile:
    store-selected: false

dns:
    enable: false

# proxies:

proxy-providers:
    Subscription1:
        type: http
        url: 'my own nodes'
        interval: 3600
        path: ./server/Subscription1.yaml
        health-check:
            enable: true
            url: https://cp.cloudflare.com/generate_204
            interval: 600

    Subscription2:
        type: http
        url: 'my own nodes'
        interval: 3600
        path: ./server/Subscription2.yaml
        health-check:
            enable: true
            url: https://cp.cloudflare.com/generate_204
            interval: 600

proxy-groups:
    # url-test select which proxy will be used by benchmarking speed to a URL.
    # fallback selects an available policy by priority. The availability is tested by accessing an URL, just like an auto url-test group.
    - name: 'fq'
      type: fallback
      use:
          - Subscription1
          - Subscription2
      # lazy: true
      url: 'https://www.google.com/robots.txt'
      interval: 100

    - name: 'global-random'
      type: load-balance
      use:
          - Subscription1
          - Subscription2
      proxies:
          - DIRECT
      url: 'http://www.gstatic.com/generate_204'
      interval: 300
      # use round robin to request the url via random proxies
      # or consistent-hashing will request same url via same proxy
      strategy: round-robin

rules:
    # bypass local
    - IP-CIDR,127.0.0.0/8,DIRECT
    - IP-CIDR,172.16.0.0/12,DIRECT
    - IP-CIDR,10.0.0.0/8,DIRECT
    - IP-CIDR,192.168.0.0/16,DIRECT

    # hot blocked site
    - DOMAIN-KEYWORD,google,fq
    - DOMAIN-KEYWORD,wikipedia,fq
    - DOMAIN-KEYWORD,facebook,fq
    - DOMAIN-KEYWORD,instagram,fq
    - DOMAIN-KEYWORD,telegram,fq
    - DOMAIN-KEYWORD,pixiv,fq
    - DOMAIN-KEYWORD,pximg,fq
    - DOMAIN-KEYWORD,v2ex,fq
    - DOMAIN,rsshub.app,fq

    # anti cloudflare
    - IP-CIDR,103.21.244.0/22,global-random
    - IP-CIDR,103.22.200.0/22,global-random
    - IP-CIDR,103.31.4.0/22,global-random
    - IP-CIDR,104.16.0.0/13,global-random
    - IP-CIDR,104.24.0.0/14,global-random
    - IP-CIDR,108.162.192.0/18,global-random
    - IP-CIDR,131.0.72.0/22,global-random
    - IP-CIDR,141.101.64.0/18,global-random
    - IP-CIDR,162.158.0.0/15,global-random
    - IP-CIDR,172.64.0.0/13,global-random
    - IP-CIDR,173.245.48.0/20,global-random
    - IP-CIDR,188.114.96.0/20,global-random
    - IP-CIDR,190.93.240.0/20,global-random
    - IP-CIDR,197.234.240.0/22,global-random
    - IP-CIDR,198.41.128.0/17,global-random

    # anti anti crawler
    - DOMAIN-KEYWORD,wechat,global-random
    - DOMAIN-KEYWORD,weixin,global-random
    - DOMAIN-KEYWORD,douban,global-random
    - DOMAIN,wemp.app,global-random
    - DOMAIN,ershicimi.com,global-random
    - DOMAIN,segmentfault.com,global-random

    # default
    - GEOIP,CN,DIRECT
    - MATCH,global-random

docker-compose.yaml, rsshub section

service.rsshub:
    image: diygod/rsshub
    container_name: rsshub
    depends_on:
      - service.redis
      - service.clash
    restart: on-failure:5
    # ports:
    #     - '1200:1200'
    environment:
      PORT: 80
      # miscellaneous settings
      NODE_ENV: production
      CACHE_TYPE: redis
      REDIS_URL: redis://service.redis:6379/
      PUPPETEER_WS_ENDPOINT: ws://service.browserless:3000
      # use proxy
      PROXY_URI: socks5://clash:8085
      REQUEST_RETRY: 5
      PROXY_STRATEGY: on_retry
    env_file:
      - rsshub.env
    expose:
      - 80
    networks:
      - net_public
      - net_private
    dns:
      - 223.5.5.5
      - 8.8.8.8
    volumes:
      - *use-local-time-zone
      - *use-local-time
    labels:
      - com.centurylinklabs.watchtower.enable=true # keep the rsshub up to date
      - one.colinx.rssmanx.description='RSShub makes everything rssable'

rsshub.env

PIXIV_REFRESHTOKEN=xxxxxxxxxx
PIXIV_BYPASS_CDN=true
PIXIV_BYPASS_DOH=https://doh.pub/dns-query
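PIXIV_IMG_PROXY=https://i.pixiv.cat

Colin-XKL commented 8 months ago

First check whether the other rsshub feeds for external sites, apart from pixiv, work normally. Then look in the clash log for connections to external sites such as pixiv and v2ex. If they are there and show no errors, that at least shows the proxy configuration on rsshub is taking effect: requests are being forwarded correctly and reach the target site through the clash proxy.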

MuXia-0326 commented 8 months ago

RSS feeds for the other sites work fine; only the few that need to reach the external network cannot be used.

Colin-XKL commented 8 months ago

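I see that in your rsshub config you added a custom field, env_file. Where did you see this usage 😂? Normally you need to put the environment variables you want to define under the environment field, at the same level as PROXY_URI.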

MuXia-0326 commented 8 months ago

It's documented on the official Docker site:

https://docs.docker.com/compose/environment-variables/set-environment-variables/

MuXia-0326 commented 8 months ago

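Configuring environment variables this way should take effect; I configured it like this before as well. Yesterday I saw that the yaml in your repo had been updated, so I redeployed, and then this problem appeared. It worked fine before. I forgot to back up my previous config.yaml, but I remember it worked after just adding one pixiv rule: - DOMAIN-KEYWORD,pixiv,fq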

Colin-XKL commented 8 months ago

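I checked the documentation. As long as your docker compose version isn't too old, this usage should be fine.

You can also enter the rsshub container and run echo $VARXXX to check whether the corresponding environment variable is set correctly.

If only individual sites have problems, look in the clash log for the domain of the affected site to see whether the request arrived and which routing rule that domain matched.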
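
One way to do that log check, sketched in Python under some assumptions: the pattern below is derived only from the two clash log lines posted earlier in this thread, and `find_requests` is a hypothetical helper, not part of clash. Other clash versions may format their log lines differently.

```python
import re

# Matches the msg="..." part of the clash log lines posted in this issue, e.g.
# [TCP] 172.18.0.7:43138 --> rsshub:80 match IPCIDR(172.16.0.0/12) using DIRECT
LOG_PATTERN = re.compile(
    r'\[(?P<proto>\w+)\] (?P<src>\S+) --> (?P<dst>\S+) '
    r'match (?P<rule>\S+) using (?P<policy>[^"]+)"'
)

# One of the log lines from the original report, used as sample input.
SAMPLE_LOG = [
    'time="2024-01-11T15:00:33+08:00" level=info msg="[TCP] 172.18.0.7:43138 '
    '--> rsshub:80 match IPCIDR(172.16.0.0/12) using DIRECT"',
]


def find_requests(log_lines, keyword):
    """Return (destination, rule, policy) for lines whose destination contains keyword."""
    hits = []
    for line in log_lines:
        m = LOG_PATTERN.search(line)
        if m and keyword in m.group("dst"):
            hits.append((m.group("dst"), m.group("rule"), m.group("policy")))
    return hits


print(find_requests(SAMPLE_LOG, "rsshub"))  # the container-to-container request
print(find_requests(SAMPLE_LOG, "pixiv"))   # empty: no pixiv request reached clash
```

An empty result for a site's domain would suggest the request never reached clash, which points at the client side (here, rsshub's proxy settings) rather than at the routing rules.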

MuXia-0326 commented 8 months ago

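I checked inside the rsshub container and the environment variables are in effect. As for the clash log, what I see is that after the line time="2024-01-11T15:03:55+08:00" level=info msg="[TCP] 172.18.0.7:35512 --> rsshub:80 match IPCIDR(172.16.0.0/12) using DIRECT" no further requests come in, yet the PROXY_URI environment variable is present in rsshub. Very strange.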

MuXia-0326 commented 8 months ago

The v2ex configuration seems to be working. It may be because I set these two options for pixiv:

PIXIV_BYPASS_CDN=true
PIXIV_BYPASS_DOH=https://doh.pub/dns-query

MuXia-0326 commented 8 months ago

I'll keep tinkering with it myself.

Colin-XKL commented 8 months ago

The error I see is "Pixiv refresh token failed." Could there be a problem with the token?

MuXia-0326 commented 8 months ago

> The error I see is "Pixiv refresh token failed." Could there be a problem with the token?

That's possible. I'll try again myself.

MuXia-0326 commented 8 months ago

I think I found the problem. After I commented out PROXY_STRATEGY: on_retry in the rsshub environment variables, it works. What is this environment variable for?

Colin-XKL commented 8 months ago

> I think I found the problem. After I commented out PROXY_STRATEGY: on_retry in the rsshub environment variables, it works. What is this environment variable for?

{
    proxyStrategy: envs.PROXY_STRATEGY || 'all', // all / on_retry
}

https://github.com/DIYgod/RSSHub/blob/836aa5b7a95688b4daefa23ff2111f5c5ac47440/lib/config.js#L77

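This variable controls when the proxy is used. It takes two values, on_retry and all. The default is all; with on_retry, in theory the proxy is only tried after a connection to the target site has failed.

My own server is deployed overseas, and I wanted it to try a direct connection first by default, then fall back to the proxy defined in clash if something goes wrong, for example being rate-limited as a crawler. I set this variable specifically to solve that. It was something I used for local testing and does not suit servers located in mainland China. It is also not explicitly documented; I came across it while reading the source code, so its behavior is not necessarily stable or as expected, and it can instead cause traffic to the target site to bypass the proxy. The version in the repo should not have this variable. Sorry about that; I'll remove it later.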
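
A rough Python model of the difference between the two values, based only on the description above and not on RSSHub's actual implementation; `uses_proxy` and `plan` are hypothetical names used for illustration.

```python
def uses_proxy(strategy, attempt):
    """Decide whether a request attempt goes through the proxy.

    attempt 0 is the first try; 1 and above are retries.
    """
    if strategy == "all":
        return True
    if strategy == "on_retry":
        return attempt > 0
    raise ValueError(f"unknown strategy: {strategy!r}")


def plan(strategy, max_retries):
    """List how each attempt would be routed, first try included."""
    return [
        "proxy" if uses_proxy(strategy, n) else "direct"
        for n in range(max_retries + 1)
    ]


print(plan("all", 2))       # every attempt goes through the proxy
print(plan("on_retry", 2))  # first attempt is direct, retries use the proxy
```

In this model, a server that cannot reach the target site directly always spends its first attempt on a connection that is bound to fail, and anything that goes wrong in the retry path leaves the request without a proxy.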

MuXia-0326 commented 8 months ago

No big deal. As long as the problem was found, that's fine.