shenweichen / DeepCTR

Easy-to-use, Modular and Extendible package of deep-learning based CTR models.
https://deepctr-doc.readthedocs.io/en/latest/index.html
Apache License 2.0
7.44k stars, 2.19k forks

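EDCN should pass both the current cross_in and the initial layer-0 cross_in to every layer; otherwise it does not match DCN's own mathematical derivation #509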

Open TangJiakai opened 1 year ago

TangJiakai commented 1 year ago

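For context on the title: the cross layer in the original DCN is usually written as `x_{l+1} = x_0 (x_l^T w_l) + b_l + x_l`, so every layer takes both the layer-0 input `x_0` and the previous output `x_l`. Below is a minimal pure-Python sketch of that recurrence for a single sample. The names `cross_layer` and `cross_network` are hypothetical and this is not the code in edcn.py.

```python
def cross_layer(x0, xl, w, b):
    """One DCN-style cross layer for a single sample.

    Computes x_{l+1} = x0 * (xl . w) + b + xl, where x0, xl, w and b
    are equal-length lists of floats.
    """
    # Scalar projection of the current layer input onto the weight vector
    proj = sum(xl_i * w_i for xl_i, w_i in zip(xl, w))
    # Cross against the layer-0 input, then add the bias and the residual xl
    return [x0_i * proj + b_i + xl_i for x0_i, b_i, xl_i in zip(x0, b, xl)]


def cross_network(x0, weights, biases):
    """Stack cross layers; each layer receives x0 and the previous output."""
    xl = x0
    for w, b in zip(weights, biases):
        xl = cross_layer(x0, xl, w, b)
    return xl
```

If a layer were given only `xl` (so that `x0` is replaced by `xl` in the product), the recurrence would no longer be the one above, which seems to be the concern raised in the title.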

shuDaoNan9 commented 1 year ago

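Does anyone know why edcn.py uses only the discrete categorical features and not the numerical features? The earlier dcn.py uses both kinds of features, but here the RegulationModule takes only the 3-D embeddings of the categorical features as input.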

shuDaoNan9 commented 1 year ago

I just read the relevant part of the paper, and it turns out that all numerical features are discretized: "For numerical features (e.g., bidding price, usage count), commonly used approaches are discretization, including soft discretization like AutoDis [4] and hard discretization via transforming numerical features to categorical features, such as logarithm discretization [13] and tree-based discretization [8]."
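As a rough illustration of the "hard discretization" idea in the quote, here is a minimal sketch that turns a non-negative numerical value into an integer bucket id, which could then be embedded like any other categorical feature. The function name `log_discretize` and the bucketing rule `floor(ln(1 + v))` are assumptions made for this example; the exact rule used in the paper's reference [13] may differ.

```python
import math


def log_discretize(value):
    """Map a non-negative numerical value to an integer bucket id.

    Assumed rule for illustration: bucket = floor(ln(1 + value)).
    """
    if value < 0:
        raise ValueError("log_discretize expects a non-negative value")
    # log1p keeps value == 0 well defined and maps it to bucket 0
    return int(math.floor(math.log1p(value)))


# Example: raw counts become categorical bucket ids
raw_counts = [0, 1, 2, 100, 1000]
bucket_ids = [log_discretize(v) for v in raw_counts]
```

The resulting bucket ids would be declared as a sparse (categorical) feature with a vocabulary size of the largest bucket id plus one, instead of feeding the normalized raw value in as a dense feature.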

TangJiakai commented 1 year ago

Discretizing continuous numerical features is still a pretty common practice, I think.

shuDaoNan9 commented 1 year ago

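The earlier models in deepctr basically all use normalized numerical features directly, and this one suddenly switches to discretization. I didn't understand why at first, so I had to go dig through the paper. I was hoping to be lazy and skip reading it o(╯□╰)o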

TangJiakai commented 1 year ago

I haven't read it either. Reading all of these CTR and sequence-model papers is exhausting 😭
