TaleLin / lin-cms-flask

🎀A simple and practical CMS implemented by Flask
http://doc.cms.talelin.com/

Gunicorn production environment: JSON view encoding fails with RecursionError: maximum recursion depth exceeded in comparison #171

Closed LeanderChen closed 3 years ago

LeanderChen commented 3 years ago
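  1. Issue title: Gunicorn production environment: JSON view encoding fails with RecursionError: maximum recursion depth exceeded in comparison
  2. Environment:
  3. Non-critical environment:
    • Server panel: BT Panel (宝塔) 7.7.0
    • Project manager: Python Project Manager 1.9
  4. Steps to reproduce: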

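(1) Install the Python 3.6.9 interpreter (production uses the BT Panel mirror hosted in China)

(2) Create the project (the steps below are done in one click by the Python Project Manager and are equivalent to the following manual steps)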

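- Install the virtual environment xxx_env
- To match the production setup, set `FLASK_ENV=production` in .flaskenv and edit .production.env (MySQL database; contents omitted)
- Install the dependencies in the virtual environment: `pip install -r requirements-prod.txt`  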

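(3) Start the project (the steps below are done in one click by the Python Project Manager and are equivalent to the following manual steps)

- `gunicorn -c gunicorn.conf` (config file shown below)  

(4) One endpoint that works in the test environment fails; the debug log is as follows: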

2021-09-23 22:16:41,140 ERROR 31456   ---  [Dummy-1] - Traceback (most recent call last):
  File "/www/CAMS/cams_venv/lib/python3.6/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/www/CAMS/cams_venv/lib/python3.6/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/www/CAMS/fac/apidoc.py", line 84, in validation
    **kwargs,
  File "/www/CAMS/cams_venv/lib/python3.6/site-packages/spectree/plugins/flask_plugin.py", line 157, in validate
    response = make_response(func(*args, **kwargs))
  File "/www/CAMS/cams_venv/lib/python3.6/site-packages/flask/helpers.py", line 223, in make_response
    return current_app.make_response(args)
  File "/www/CAMS/fac/encoder.py", line 45, in make_lin_response
    o = jsonify(o)
  File "/www/CAMS/cams_venv/lib/python3.6/site-packages/flask/json/__init__.py", line 370, in jsonify
    dumps(data, indent=indent, separators=separators) + "\n",
  File "/www/CAMS/cams_venv/lib/python3.6/site-packages/flask/json/__init__.py", line 211, in dumps
    rv = _json.dumps(obj, **kwargs)
  File "/root/.pyenv/versions/3.6.9/lib/python3.6/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/root/.pyenv/versions/3.6.9/lib/python3.6/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/root/.pyenv/versions/3.6.9/lib/python3.6/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/www/CAMS/fac/encoder.py", line 35, in default
    return JSONEncoder.default(self, o)
  File "/www/CAMS/fac/encoder.py", line 35, in default
    return JSONEncoder.default(self, o)
  File "/www/CAMS/fac/encoder.py", line 35, in default
    return JSONEncoder.default(self, o)
  [Previous line repeated 935 more times]
  File "/www/CAMS/fac/encoder.py", line 24, in default
    if isinstance(o, BaseModel):
  File "/root/.pyenv/versions/3.6.9/lib/python3.6/abc.py", line 190, in __instancecheck__
    subclass in cls._abc_negative_cache):
  File "/root/.pyenv/versions/3.6.9/lib/python3.6/_weakrefset.py", line 75, in __contains__
    return wr in self.data
RecursionError: maximum recursion depth exceeded in comparison
  5. Supplementary files:

(1) gunicorn.conf

from gevent import monkey

monkey.patch_all(thread=False)

import os
import multiprocessing

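#==== Service configuration ====
proc_name = 'cams'
# address and port gunicorn binds to
bind = '0.0.0.0:5000'
# user the worker processes run as
user = 'www'
# working directory
chdir = '/www/xxx/'
# whether to run as a daemon
daemon = False
debug = True
# file holding the gunicorn process id; kill the id stored in this file to stop gunicorn
pidfile = chdir + '/logs/cams.pid'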
#https://github.com/benoitc/gunicorn/issues/1194
#keepalive = 75 # needs to be longer than the ELB idle timeout
##about timeout issues
#https://github.com/benoitc/gunicorn/issues/1440
#https://github.com/globaldigitalheritage/arches-3d/issues/54
#https://github.com/benoitc/gunicorn/issues/588
#https://github.com/benoitc/gunicorn/issues/1194
#https://github.com/benoitc/gunicorn/issues/942
#https://stackoverflow.com/questions/10855197/gunicorn-worker-timeout-error

#==== Worker configuration (processes, threads, and worker class) ====
workers = multiprocessing.cpu_count() * 2 + 1
threads = 2
worker_class = 'geventwebsocket.gunicorn.workers.GeventWebSocketWorker'
worker_connections = 2000

#==== Logging configuration ====
loglevel = 'warning'
access_log_format = '%(t)s %(p)s %(h)s "%(r)s" %(s)s %(L)s %(b)s %(f)s" "%(a)s"'
backlog = 512
errorlog = chdir + '/logs/error.log'
accesslog = chdir + '/logs/access.log'
timeout = 10

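#==== Extended logging configuration ====
# access log configuration; for more settings see: https://docs.gunicorn.org/en/stable/settings.html#logging
# `%(a)s` reference example: '%(a)s "%(b)s" %(c)s' % {'a': 1, 'b' : -2, 'c': 'c'}
# the configuration below prints the IP, request method, request URL path, HTTP protocol, response status, user agent, and request duration
# example: [2020-08-19 19:18:19 +0800] [50986]: [INFO] 127.0.0.1 POST /test/v1.0 HTTP/1.1 200 PostmanRuntime/7.26.3 0.088525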
#access_log_format = "%(h)s %(r)s %(s)s %(a)s %(L)s"

#https://github.com/benoitc/gunicorn/issues/2250
# logconfig_dict = {
#     'version':1,
#     'disable_existing_loggers': False,
#     # recent versions require a root configuration; otherwise: Error: Unable to configure root logger
#     "root": {
#           "level": "DEBUG",
#           "handlers": ["console"] # key in the handlers dict
#     },
#     'loggers':{
#         "gunicorn.error": {
#             "level": "DEBUG",# log level
#             "handlers": ["error_file"], # key in the handlers dict
#             # whether to also print the log to the console; if True (or 1), it is written to the supervisor log file (logfile), which is very useful
#             "propagate": 0, 
#             "qualname": "gunicorn_error"
#         },

#         "gunicorn.access": {
#             "level": "DEBUG",
#             "handlers": ["access_file"],
#             "propagate": 0,
#             "qualname": "access"
#         }
#     },
#     'handlers':{
#         "error_file": {
#             "class": "logging.handlers.RotatingFileHandler",
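#             "maxBytes": 1024*1024*10,# maximum log file size (10 MB here)
#             "backupCount": 5,# number of backups (required, and must be a positive integer, if the log size is to be limited)
#             "formatter": "generic",# key in the formatters dict
#             "filename": chdir + "/logs/error.log" # change this path if your setup requires it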
#         },
#         "access_file": {
#             "class": "logging.handlers.RotatingFileHandler",
#             "maxBytes": 1024*1024*50,
#             "backupCount": 3,
#             "formatter": "generic",
#             "filename": "/logs/access.log", # change this path if your setup requires it
#         },
#         'console': {
#             'class': 'logging.StreamHandler',
#             'level': 'DEBUG',
#             'formatter': 'generic',
#         },

#     },
#     'formatters':{
#         "generic": {
#             "format": "%(asctime)s [%(process)d]: [%(levelname)s] %(message)s", # log format
#             "datefmt": "[%Y-%m-%d %H:%M:%S %z]",# timestamp display format
#             "class": "logging.Formatter"
#         }
#     }
# }

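PS: Community members suggested that the non-blocking monkey-patch changes and the order in which the patch is imported could cause the recursion-limit error, which is why lines 1-6 import it this way.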

(2) Response of the failing endpoint in the test environment (the development environment, without gunicorn):

{
  "count": 5,
  "items": [
    {
      "_administrator": {
        "avatar": "http://127.0.0.1:5000/assets/",
        "email": "",
        "id": 1,
        "nickname": "",
        "username": "root"
      },
      "_district": {
        "code": "default",
        "create_time": "2021-09-23T11:51:53Z",
        "id": 1,
        "mount": true,
        "name": "默认区域",
        "update_time": "2021-09-23T11:51:53Z"
      },
      "_district_current": {
        "code": "default",
        "create_time": "2021-09-23T11:51:53Z",
        "id": 1,
        "mount": true,
        "name": "默认区域",
        "update_time": "2021-09-23T11:51:53Z"
      },
      "_owner_org": {
        "code": "default",
        "create_time": "2021-09-23T11:51:53Z",
        "id": 1,
        "name": "您的组织",
        "note": "",
        "parent": 0,
        "update_time": "2021-09-23T11:51:53Z"
      },
      "_user": {
        "avatar": "http://127.0.0.1:5000/assets/",
        "email": "",
        "id": 1,
        "nickname": "",
        "username": "root"
      },
      "_user_dept": {
        "code": "default",
        "create_time": "2021-09-23T11:51:53Z",
        "id": 1,
        "name": "您的组织",
        "note": "",
        "parent": 0,
        "update_time": "2021-09-23T11:51:53Z"
      },
      "_user_org": {
        "code": "default",
        "create_time": "2021-09-23T11:51:53Z",
        "id": 1,
        "name": "您的组织",
        "note": "",
        "parent": 0,
        "update_time": "2021-09-23T11:51:53Z"
      },
      "administrator": 1,
      "amodel": "中号新型教研室办公桌",
      "assets_district_id": 1,
      "assets_district_id_current": 1,
      "assets_status": 1,
      "attachments": "[]",
      "catid": 1,
      "code": "MX-CNN-20200701152",
      "create_time": "2021-09-23T11:51:53Z",
      "district_id": 1,
      "epc": "",
      "id": 1,
      "input_time": "2021-09-23T11:51:53Z",
      "location": "教学楼贮藏库",
      "model": "枣红色标准办公桌",
      "name": "办公桌",
      "note": "这是测试资产,在开始使用前应删除",
      "owner_org": 1,
      "photo": "",
      "price": 850,
      "sn": "SN200701213",
      "source": "竞争性采购",
      "unit": "张",
      "user": 1,
      "user_dept": 1,
      "user_org": 1,
      "valid_mons": 60
    }
  ],
  "page": 0,
  "total": 1,
  "total_page": 1
}
sunlin92 commented 3 years ago

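The error output alone is not enough to pinpoint the problem. Here are a few guesses and suggestions:

  1. Run pip install --upgrade gevent to update gevent
  2. Put the gevent monkey patch on the first two lines of the file
    from gevent import monkey 
    monkey.patch_all()
    ...
  3. To rule out the monkey patch, try commenting out the whole monkey.patch_all() line
  4. Deploy with uwsgi instead of gunicorn
  5. The returned JSON structure is rather messy; reorganizing and simplifying the data structure might solve the problem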
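
For suggestion 5, one way to narrow things down is to scan the payload for values that are not plain JSON types before it reaches `jsonify`. The following is only a minimal sketch, assuming the response is built from ordinary dicts and lists; the helper name `find_non_native` and the sample `Decimal` price are hypothetical and not taken from the project:

```python
from decimal import Decimal

# Types the stock json module can encode without calling a custom `default`.
JSON_NATIVE = (str, int, float, bool, type(None))


def find_non_native(value, path="$"):
    """Return (path, type name) pairs for values that would need a custom encoder."""
    if isinstance(value, JSON_NATIVE):
        return []
    if isinstance(value, dict):
        found = []
        for key, item in value.items():
            found.extend(find_non_native(item, f"{path}.{key}"))
        return found
    if isinstance(value, (list, tuple)):
        found = []
        for index, item in enumerate(value):
            found.extend(find_non_native(item, f"{path}[{index}]"))
        return found
    # Anything else (Decimal, datetime, model objects, ...) ends up in `default`.
    return [(path, type(value).__name__)]


# Hypothetical payload shaped like the response above.
sample = {"count": 5, "items": [{"id": 1, "name": "desk", "price": Decimal("850")}]}
suspects = find_non_native(sample)
```

Logging `suspects` in the failing view would show which fields, if any, are handed to the custom encoder's `default` method in production but not in development.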
LeanderChen commented 3 years ago

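Thanks to the author and the community for the attention! The problem has been worked around, but the cause is still unknown.

Neither the full error output nor the other available information (including checks of the OS, performance parameters, and so on) was enough to diagnose the cause. The eventual fix was to convert one field in the response list from a numeric type to a string before wrapping the JSON response. Because the problem is so specific and the conditions for reproducing it are demanding, I will not dig further. The following issue keywords are not among the actual factors of the problem (they are unrelated):

My preliminary guess is that something in the business layer or the framework model layer handles the wrapping of the view response improperly. I need to get back to the project schedule, so tracing the bug will have to wait.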
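
One reading of the traceback that fits this workaround: the frame `return JSONEncoder.default(self, o)` at `fac/encoder.py` line 35 repeats until the limit is hit, which suggests that the name `JSONEncoder` in that module refers to the custom subclass itself, so the fallback call re-enters the same method for any value the earlier branches do not handle. The sketch below reproduces that pattern with the standard `json` module only. The class names are hypothetical, and using `Decimal` as the unhandled numeric type is an assumption, not something the thread confirms:

```python
import json
from decimal import Decimal


class SelfReferencingEncoder(json.JSONEncoder):
    """Hypothetical encoder that mirrors the repeated frame in the traceback."""

    def default(self, o):
        if isinstance(o, set):
            return sorted(o)
        # Bug: this names the class itself, so the call re-enters this method.
        return SelfReferencingEncoder.default(self, o)


class FixedEncoder(json.JSONEncoder):
    """Same encoder, but the fallback goes to the parent class."""

    def default(self, o):
        if isinstance(o, set):
            return sorted(o)
        if isinstance(o, Decimal):
            return str(o)  # assumption: render decimals as strings
        # The parent implementation raises TypeError for unsupported types.
        return super().default(o)


def try_encode(encoder_cls, payload):
    """Encode payload, returning the JSON text or the name of the error raised."""
    try:
        return json.dumps(payload, cls=encoder_cls)
    except RecursionError:
        return "RecursionError"
    except TypeError:
        return "TypeError"


broken = try_encode(SelfReferencingEncoder, {"price": Decimal("850")})
fixed = try_encode(FixedEncoder, {"price": Decimal("850")})
```

With the fallback delegated through `super()`, an unsupported value produces a `TypeError` that names the offending type rather than a recursion error, which would make this kind of failure much easier to trace.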