aigc-apps / AMFormer

The AMFormer algorithm, accepted at AAAI-2024, for deep tabular learning
GNU General Public License v3.0

Error from python main.py --config config/run/ours_fttrans-hcdr.yaml, and a problem when running main.py directly #11

Open Cgetier520990 opened 1 week ago

Cgetier520990 commented 1 week ago

Hello, I followed your instructions, but I don't know why I get the following error:

(AMFormer) PS E:\PYTHON_PROGRAMME\tabular learn\AMFormer-main> python main.py --config config/run/ours_fttrans-hcdr.yaml
================ Loading Config ================
Traceback (most recent call last):
  File "E:\PYTHON_PROGRAMME\tabular learn\AMFormer-main\main.py", line 68, in <module>
    args = args.initialize()
           ^^^^^^^^^^^^^^^^^
  File "E:\PYTHON_PROGRAMME\tabular learn\AMFormer-main\config\cfg.py", line 63, in initialize
    config = self.load_base(derived_config, config)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\PYTHON_PROGRAMME\tabular learn\AMFormer-main\config\cfg.py", line 35, in load_base
    derivedconfig = yaml.safe_load(f)
                    ^^^^^^^^^^^^^^^^^
  File "D:\anaconda\envs\AMFormer\Lib\site-packages\yaml\__init__.py", line 125, in safe_load
    return load(stream, SafeLoader)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\anaconda\envs\AMFormer\Lib\site-packages\yaml\__init__.py", line 79, in load
    loader = Loader(stream)
             ^^^^^^^^^^^^^^
  File "D:\anaconda\envs\AMFormer\Lib\site-packages\yaml\loader.py", line 34, in __init__
    Reader.__init__(self, stream)
  File "D:\anaconda\envs\AMFormer\Lib\site-packages\yaml\reader.py", line 85, in __init__
    self.determine_encoding()
  File "D:\anaconda\envs\AMFormer\Lib\site-packages\yaml\reader.py", line 124, in determine_encoding
    self.update_raw()
  File "D:\anaconda\envs\AMFormer\Lib\site-packages\yaml\reader.py", line 178, in update_raw
    data = self.stream.read(size)
           ^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'gbk' codec can't decode byte 0xae in position 278: illegal multibyte sequence

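In addition, when I run main.py directly, it fails with an error saying that "Namespace" has no "model_name". After I added "model_name: xu" to config/configs/base.yaml, that error went away, but I then got AttributeError: module 'utils.data_load' has no attribute 'pretrain'. This is probably caused by the data_name: pretrain field in config/configs/base.yaml. I looked at the utils/data_load file but could not work out what to change.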

Ch3ngY1 commented 1 week ago

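Your error message shows UnicodeDecodeError: 'gbk' codec can't decode byte 0xae in position 278: illegal multibyte sequence, so this should be an encoding problem. Is model_name: xu a model you defined yourself? As for data_name: pretrain, the config loading code implements inheritance between configs: the data_name field in a config under datasets overrides the same field in base.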
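The override behaviour described above can be illustrated with a minimal sketch. This is not the repository's loader: `merge_configs` is a hypothetical helper, `batch_size` is an invented key, and the dataset value `hcdr` is only guessed from the config file name. It assumes the configs have already been parsed into plain dicts.

```python
def merge_configs(base: dict, derived: dict) -> dict:
    """Return a new dict in which keys from `derived` override those in `base`.

    Hypothetical helper for illustration; nested dicts are merged recursively.
    """
    merged = dict(base)  # shallow copy so `base` itself is left unchanged
    for key, value in derived.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_configs(merged[key], value)
        else:
            merged[key] = value
    return merged


# Assumed contents: base.yaml carries a placeholder data_name,
# and the dataset config replaces it.
base_cfg = {"data_name": "pretrain", "batch_size": 256}
dataset_cfg = {"data_name": "hcdr"}

final_cfg = merge_configs(base_cfg, dataset_cfg)
print(final_cfg["data_name"])
```

Under this reading, running main.py without --config would leave the placeholder data_name: pretrain from base in effect, which would be consistent with the AttributeError reported above.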

Cgetier520990 commented 1 week ago

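Yes, I have solved the gbk problem: where each config file is read, "utf-8" needs to be specified as the encoding. As for your question about model_name: xu and the config inheritance for data_name: pretrain, I have seen that as well. Running main.py directly does not work; --config config/run/ours_fttrans-hcdr.yaml must be passed. The remaining problem now seems to be that the dataset does not match. How is the data supposed to be processed?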
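The encoding fix mentioned above can be sketched as follows. On a Chinese-locale Windows system, Python's open() without an explicit encoding falls back to the locale encoding (gbk here), so a UTF-8 config file containing non-ASCII text can fail to decode. The helper name `read_text_utf8` and the sample file content are assumptions for illustration; in the repository, the file object handed to yaml.safe_load would presumably be opened in the same way.

```python
import os
import tempfile


def read_text_utf8(path: str) -> str:
    """Read a text file as UTF-8 regardless of the platform's locale encoding."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


# Write a small YAML-like file containing non-ASCII text, then read it back.
# The content is made up for this example.
sample = "data_name: hcdr  # 数据集名称\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    cfg_path = os.path.join(tmp_dir, "example.yaml")
    with open(cfg_path, "w", encoding="utf-8") as f:
        f.write(sample)
    loaded = read_text_utf8(cfg_path)

print(loaded == sample)
```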

Ch3ngY1 commented 1 week ago

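For the input data format, see the input/output format section of the README. In the code this corresponds to the __getitem__ function of the dataset class, which returns a series of cate_id values, the numerical data, and the label. You need to convert the categorical data to ids, and then specify num_cate and num_cont in the datasets yaml file; these correspond to the number of cate categories and the number of cont categories in the data, respectively.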
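As a rough illustration of the conversion described above, the sketch below maps categorical strings to integer ids and returns (cate ids, numerical values, label) from __getitem__. The class `TabularRows`, the helper `build_vocab`, and the sample rows are all hypothetical; the repository's actual data class and the exact meaning of num_cate and num_cont are defined by its README and code, not by this example.

```python
from typing import Dict, List, Sequence, Tuple


def build_vocab(values: Sequence[str]) -> Dict[str, int]:
    """Map each distinct category string to an integer id, in order of first appearance."""
    vocab: Dict[str, int] = {}
    for v in values:
        if v not in vocab:
            vocab[v] = len(vocab)
    return vocab


class TabularRows:
    """Hypothetical dataset-like class; not the repository's actual data class."""

    def __init__(
        self,
        cate_rows: Sequence[Sequence[str]],
        cont_rows: Sequence[Sequence[float]],
        labels: Sequence[int],
    ) -> None:
        # Build one vocabulary per categorical column.
        columns = list(zip(*cate_rows))
        self.vocabs = [build_vocab(col) for col in columns]
        # Replace every categorical string with its per-column id.
        self.cate_ids = [
            [vocab[v] for vocab, v in zip(self.vocabs, row)] for row in cate_rows
        ]
        self.cont_rows = [[float(x) for x in row] for row in cont_rows]
        self.labels = list(labels)

    def __len__(self) -> int:
        return len(self.labels)

    def __getitem__(self, idx: int) -> Tuple[List[int], List[float], int]:
        # Returns categorical ids, numerical values, and the label for one row.
        return self.cate_ids[idx], self.cont_rows[idx], self.labels[idx]


# Made-up sample data: two categorical columns, two numerical columns.
rows = TabularRows(
    cate_rows=[["M", "cash"], ["F", "revolving"], ["M", "revolving"]],
    cont_rows=[[0.5, 12000.0], [1.2, 8000.0], [0.1, 15000.0]],
    labels=[0, 1, 0],
)
cate_ids, cont_values, label = rows[2]
print(cate_ids, cont_values, label)
```

A real implementation would typically return tensors rather than lists and would persist the vocabularies so that the same ids are used at training and inference time.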