FedNLP: An Industry and Research Integrated Platform for Federated Learning in Natural Language Processing, Backed by FedML, Inc. The Previous Research Version is Accepted to NAACL 2022
KeyError: 'Unable to open object (bad heap free list)' #29

Open ysgncss opened 2 years ago

ysgncss commented 2 years ago

I get the error below when I use 20news for text classification. Can anyone help? I downloaded the dataset from these URLs: https://fednlp.s3-us-west-1.amazonaws.com/partition_files/20news_partition.h5 https://fednlp.s3-us-west-1.amazonaws.com/data_files/20news_data.h5

Loading data from h5 file.: 0%| | 0/11314 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/root/miniconda3/envs/fednlp/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/root/miniconda3/envs/fednlp/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/FedNLP-master/experiments/centralized/transformer_exps/main_tc.py", line 91, in <module>
    train_dl, test_dl = dm.load_centralized_data()
  File "/home/FedNLP-master/data_manager/base_data_manager.py", line 112, in load_centralized_data
    train_data = self.read_instance_from_h5(data_file, train_index_list)
  File "/home/FedNLP-master/data_manager/text_classification_data_manager.py", line 23, in read_instance_from_h5
    X.append(data_file["X"][str(idx)][()].decode("utf-8"))
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/root/miniconda3/envs/fednlp/lib/python3.7/site-packages/h5py/_hl/group.py", line 305, in __getitem__
    oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5o.pyx", line 190, in h5py.h5o.open
KeyError: 'Unable to open object (bad heap free list)'
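Before digging into library versions, it may be worth ruling out a damaged download. This is only a guess and nothing in this thread confirms it. The sketch below uses the standard library alone to check that a file exists, report its size, and confirm that it starts with the 8-byte HDF5 signature. The helper names `looks_like_hdf5` and `report` are made up for illustration, the paths are the ones from the command used in this issue, and the check assumes the signature sits at offset 0 (the usual case when the file has no user block).

```python
import os

# Fixed 8-byte signature that opens an HDF5 file (per the HDF5 format spec).
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path):
    """Return True if the file at `path` starts with the HDF5 signature."""
    with open(path, "rb") as f:
        return f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE


def report(path):
    """Print size and signature status for one downloaded file."""
    path = os.path.expanduser(path)
    if not os.path.exists(path):
        print(f"{path}: missing")
        return
    size_mb = os.path.getsize(path) / 1e6
    status = "ok" if looks_like_hdf5(path) else "not an HDF5 signature"
    print(f"{path}: {size_mb:.1f} MB, signature {status}")


if __name__ == "__main__":
    # Paths as used in the command from this issue; adjust to your setup.
    report("~/fednlp_data/data_files/20news_data.h5")
    report("~/fednlp_data/partition_files/20news_partition.h5")
```

A passing signature does not prove the file is intact, since a truncated download still starts with the right bytes. Comparing the reported size against the size of the file on the server, or simply downloading it again, is the more reliable test.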

DeviRule commented 2 years ago

Please check your installed h5py version and make sure it matches the h5py version listed in our requirement.txt.
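A quick way to do that comparison from inside the same conda environment is sketched below. The helpers `installed_version` and `matches_pin` are hypothetical names, and the pinned value `3.1.0` is taken from the follow-up comment rather than from the requirements file itself, so treat it as an assumption and substitute whatever the file actually lists.

```python
try:
    from importlib import metadata  # standard library on Python 3.8+
except ImportError:  # Python 3.7: needs the importlib_metadata backport
    import importlib_metadata as metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(package, pinned):
    """Return True only if `package` is installed at exactly `pinned`."""
    return installed_version(package) == pinned


if __name__ == "__main__":
    PINNED = "3.1.0"  # assumption: replace with the version in the requirements file
    found = installed_version("h5py")
    print(f"h5py installed: {found}, pinned: {PINNED}, match: {matches_pin('h5py', PINNED)}")
```

Run it with the same interpreter that launches the experiment (here, the `fednlp` conda environment), since another environment may have a different h5py installed.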

ysgncss commented 2 years ago

I still get the same error with h5py==3.1.0 when I run this script:

DATA_NAME=20news
CUDA_VISIBLE_DEVICES=1 python -m experiments.centralized.transformer_exps.main_tc \
  --dataset ${DATA_NAME} \
  --data_file ~/fednlp_data/data_files/${DATA_NAME}_data.h5 \
  --partition_file ~/fednlp_data/partition_files/${DATA_NAME}_partition.h5 \
  --partition_method niid_label_clients=100_alpha=1.0 \
  --model_type distilbert \
  --model_name distilbert-base-uncased \
  --do_lower_case True \
  --train_batch_size 8 \
  --eval_batch_size 8 \
  --max_seq_length 256 \
  --learning_rate 1e-1 \
  --epochs 20 \
  --evaluate_during_training_steps 500 \
  --output_dir /tmp/${DATA_NAME}_fed/ \
  --n_gpu 1

MrigankRaman commented 2 years ago

Hi! We at FedML have launched a new platform for FedNLP where there should be no such issue. Can you please check whether you face the same issue there? Here is the new FedNLP platform: https://github.com/FedML-AI/FedML/tree/master/python/app/fednlp