ispamm / MHyEEG

Official PyTorch repository for Hypercomplex Multimodal Emotion Recognition from EEG and Peripheral Physiological Signals, ICASSPW 2023.

pandas.errors.ParserError: Error tokenizing data. C error: Expected 44 fields in line 3169, saw 56 #2

Open logicvanlyf opened 4 months ago

logicvanlyf commented 4 months ago

Hello, when reproducing the code and running the preprocessing.py file, I get the following error:

Preprocessing:  33%|███▎      | 185/565 [00:52<01:48, 3.49it/s]
Traceback (most recent call last):
  File "D:\Codes\MHyEEG-main\data\preprocessing.py", line 252, in <module>
    preprocess(sessions_dir, args.save_path, args.verbose)
  File "D:\Codes\MHyEEG-main\data\preprocessing.py", line 125, in preprocess
    gaze_df = pd.read_csv(gaze_file, sep='\t', skiprows=23)
  File "D:\anaconda3\envs\pytorch_thesis\lib\site-packages\pandas\io\parsers\readers.py", line 948, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "D:\anaconda3\envs\pytorch_thesis\lib\site-packages\pandas\io\parsers\readers.py", line 617, in _read
    return parser.read(nrows)
  File "D:\anaconda3\envs\pytorch_thesis\lib\site-packages\pandas\io\parsers\readers.py", line 1748, in read
    ) = self._engine.read(  # type: ignore[attr-defined]
  File "D:\anaconda3\envs\pytorch_thesis\lib\site-packages\pandas\io\parsers\c_parser_wrapper.py", line 234, in read
    chunks = self._reader.read_low_memory(nrows)
  File "parsers.pyx", line 843, in pandas._libs.parsers.TextReader.read_low_memory
  File "parsers.pyx", line 904, in pandas._libs.parsers.TextReader._read_rows
  File "parsers.pyx", line 879, in pandas._libs.parsers.TextReader._tokenize_rows
  File "parsers.pyx", line 890, in pandas._libs.parsers.TextReader._check_tokenize_status
  File "parsers.pyx", line 2058, in pandas._libs.parsers.raise_parser_error
pandas.errors.ParserError: Error tokenizing data. C error: Expected 44 fields in line 3169, saw 56

This error comes from pandas while it is tokenizing the gaze file: based on the header, each row should have 44 fields, but row 3169 was parsed into 56 fields. This is usually caused by lines in the data file that do not match the format of the rest of the file.

Is there a corresponding solution? If so, please let me know; I would be very grateful.
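To locate the offending rows without pandas raising, the field-count check can be run by hand. A minimal sketch, under stated assumptions: the inline `tsv` string is a hypothetical stand-in for the gaze file (not the repo's data), and the `on_bad_lines` keyword requires pandas >= 1.3.

```python
import io

import pandas as pd

# Hypothetical inline TSV standing in for the gaze file: the third line
# has too many fields, like line 3169 of the real file.
tsv = "a\tb\tc\n1\t2\t3\n4\t5\t6\t7\t8\n9\t10\t11\n"

# Count tab-separated fields per line and flag rows that disagree with
# the first (header) row -- the same check the C parser does before raising.
lines = tsv.splitlines()
expected = len(lines[0].split("\t"))
bad = [i + 1 for i, line in enumerate(lines) if len(line.split("\t")) != expected]
print(bad)  # 1-based line numbers of malformed rows

# pandas (>= 1.3) can also skip (or warn on) such rows instead of raising:
df = pd.read_csv(io.StringIO(tsv), sep="\t", on_bad_lines="skip")
```

Note that `on_bad_lines="skip"` silently drops data, so it is best used to diagnose the file rather than as a permanent fix.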

Xiaochi111 commented 3 months ago

(quoting logicvanlyf's original report above)

Hello, I have also encountered this problem. Have you solved it? If you have, could you tell me the solution?

KONE544174974 commented 3 months ago

That's because the file P10-Rec1-All-Data-New-Section_30.tsv is missing 3 lines of data, so it needs to be corrected by hand~

balancedzq commented 3 months ago

(quoting logicvanlyf's original report above)

Hello, I have also encountered this problem. Have you solved it? If you have, could you tell me the solution?


balancedzq commented 3 months ago

(quoting KONE544174974's reply above)

Hello, could you tell me more about how to deal with this problem? Thank you!

KONE544174974 commented 3 months ago

(quoting the exchange above)

I used a simple way to correct this problem: find the file named P10-Rec1-All-Data-New-Section_30.tsv and check around line 3159 or 3169 (I can't remember exactly which). You will find 3 lines that differ from the rest; follow the format of the preceding or following lines to re-add the 3 missing lines. You will need to spend a little time getting familiar with the structure of the data files to mock them up~ good luck!
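This manual repair can also be approximated programmatically. A minimal sketch, with assumptions labeled: the `clean_tsv` helper is hypothetical (not the repo's code), it drops the malformed rows rather than reconstructing the 3 missing lines as described above, and the demo uses a 1-line header instead of the real file's 24 header lines (23 metadata lines plus the column header that `skiprows=23` implies).

```python
import tempfile
from pathlib import Path


def clean_tsv(src, dst, n_header=24):
    """Copy src to dst, dropping data rows whose tab-separated field count
    differs from the column-header row. Returns the 1-based line numbers
    of the dropped rows. n_header counts the lines up to and including
    the column header (24 is a guess based on skiprows=23)."""
    lines = Path(src).read_text().splitlines()
    header = lines[:n_header]
    expected = len(lines[n_header - 1].split("\t"))  # fields in the column header
    kept, dropped = list(header), []
    for i, line in enumerate(lines[n_header:], start=n_header + 1):
        if len(line.split("\t")) == expected:
            kept.append(line)
        else:
            dropped.append(i)
    Path(dst).write_text("\n".join(kept) + "\n")
    return dropped


# Tiny self-contained demo (1 header line instead of the real file's 24):
tmp = Path(tempfile.mkdtemp())
(tmp / "raw.tsv").write_text("a\tb\n1\t2\n3\t4\t5\n6\t7\n")
dropped = clean_tsv(tmp / "raw.tsv", tmp / "clean.tsv", n_header=1)
print(dropped)
```

Dropping rows loses a few gaze samples, so re-adding the lines by hand, as suggested above, is the more faithful fix; this is only a quick way to get past the crash.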

balancedzq commented 3 months ago

(quoting the exchange above)

Hello, I know very little about the original data file. I would be very grateful if you could share your corrected P10-Rec1-All-Data-New-Section_30.tsv file.

elelo22 commented 3 months ago

Hi everyone, sorry for the late reply. I tried running the code again but I don't get this error, so maybe the dataset has changed? In fact, I checked and it seems I don't have this file P10-Rec1-All-Data-New-Section_30.tsv; I have P10-Rec1-All-Data-New_Section_28.tsv and then P10-Rec1-All-Data-New_Section_32.tsv. I think the fastest solution would be to just skip this file, and hopefully others don't have the same problem.

Otherwise, try @KONE544174974's solution; maybe they can give a bit more detail on how they solved the problem, or there is a way to do it programmatically that could be shared. I would try it myself, but without being able to reproduce the problem I can't come up with a solution.

aishanii commented 4 weeks ago

Hi, while running the main file I am getting this error: RuntimeError: weight tensor should be defined either for all 3 classes or no classes but got weight tensor of shape: [2]

----Loading dataset----
Dataset: MAHNOB-HCI

Training samples: 360

Validation samples: 3

Training distribution: [210 150]

wandb: Syncing run proud-leaf-49
Number of parameters: 19663747

Running on GPU? True - gpu_num: 0
Train round:   0% 0/45 [00:00<?, ?batch/s]
tensor([[ 0.0504, -0.0277, -0.0015],
        [-0.1052, -0.0144,  0.0084],
        [-0.1097,  0.0798,  0.0005],
        [-0.0701, -0.0238, -0.0405],
        [-0.0034,  0.0291,  0.0478],
        [-0.0992,  0.0290, -0.0440],
        [-0.1483, -0.0215, -0.0354],
        [-0.0660,  0.0021, -0.0198]], device='cuda:0', grad_fn=<...>)
tensor([2, 1, 2, 2, 1, 1, 2, 1], device='cuda:0')
Traceback (most recent call last):
  File "/content/drive/MyDrive/MHyEEG-main-share/MHyEEG-main/main.py", line 88, in <module>
    main(args, n_workers)
  File "/content/drive/MyDrive/MHyEEG-main-share/MHyEEG-main/main.py", line 47, in main
    trainer.train(train_loader, eval_loader)
  File "/content/drive/MyDrive/MHyEEG-main-share/MHyEEG-main/training.py", line 92, in train
    loss = self.criterion(outputs, labels)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/loss.py", line 1185, in forward
    return F.cross_entropy(input, target, weight=self.weight,
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py", line 3086, in cross_entropy
    return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
RuntimeError: weight tensor should be defined either for all 3 classes or no classes but got weight tensor of shape: [2]
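The mismatch can be reproduced and fixed in isolation. The likely cause is that the class weights were computed from the training distribution `[210 150]`, which covers only the 2 classes present in this split, while the model outputs 3 logits per sample. A minimal sketch: the random logits stand in for the model, and the inverse-frequency weighting is just one common choice, not necessarily what training.py actually does.

```python
import torch
import torch.nn.functional as F

# Shapes from the log above: 3 logits per sample, labels in {1, 2},
# and a loss built with a 2-element class-weight vector.
logits = torch.randn(8, 3)
labels = torch.tensor([2, 1, 2, 2, 1, 1, 2, 1])

try:
    F.cross_entropy(logits, labels, weight=torch.tensor([0.5, 0.5]))
except RuntimeError as err:
    print(err)  # reproduces the "all 3 classes or no classes" error

# Fix: provide one weight per *output* class, including classes that
# happen to be absent from this split (class 0 here).
counts = torch.bincount(labels, minlength=logits.shape[1]).float()
weight = counts.sum() / counts.clamp(min=1)  # clamp guards empty classes
loss = F.cross_entropy(logits, labels, weight=weight)
```

So when building the criterion, the weight vector should be sized by the number of model outputs (e.g. `minlength=num_classes` in `bincount`), not by the number of classes observed in the training labels.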

z1-2y commented 6 days ago

(quoting logicvanlyf's original report above)

Hello, have you solved this problem?