zzzzjx commented 3 years ago

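Hello, I read your paper and found it very inspiring. While reproducing your code on a different dataset, I ran into this error: `if not self.is_training: assert idx == idx_new, f'idx {idx} != idx_new {idx_new} during testing.'` `AssertionError: idx 3 != idx_new 762 during testing.`

Is there something else I need to modify? Looking forward to your reply.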

FangShancheng commented 3 years ago

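@zzzzjx Hello, this happens because some samples in your test data did not pass data validation. In dataset.py, please check whether your test data passes the validation checks that call self._next_image(idx).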
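The validation described above can be checked offline before running the test phase. The following is a minimal sketch, assuming the labels have already been read into a Python list; `find_rejected_labels`, `charset` and `max_length` are hypothetical names used for illustration and are not part of ABINet. The two checks mirror the charset and length conditions quoted later in this thread.

```python
from typing import Iterable, List, Tuple


def find_rejected_labels(
    labels: Iterable[str], charset: str, max_length: int
) -> List[Tuple[int, str, str]]:
    """Return (index, label, reason) for every label that fails a check."""
    allowed = set(charset)
    rejected = []
    for idx, raw in enumerate(labels):
        label = raw.strip()
        if not set(label).issubset(allowed):
            # At least one character is not in the configured charset.
            rejected.append((idx, label, "character outside charset"))
        elif len(label) == 0 or len(label) > max_length:
            # Empty labels and over-long labels are both rejected.
            rejected.append((idx, label, "length out of range"))
    return rejected


if __name__ == "__main__":
    # Hypothetical labels and charset, for illustration only.
    demo_labels = ["hello", "你好", "", "a" * 30]
    demo_charset = "abcdefghijklmnopqrstuvwxyz0123456789"
    for idx, label, reason in find_rejected_labels(demo_labels, demo_charset, 25):
        print(idx, repr(label), reason)
```

If this reports any index for a test set, the assertion in this issue is expected to fire for that index, since the sample would be swapped for another one.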

lyc728 commented 2 years ago

Did you manage to solve this in the end? I am using my own Chinese dataset and get the same error.

YYQ-wxsy commented 1 year ago

Hello, I ran into the same problem. How exactly did you solve it?

shivpoojansaini commented 1 year ago

I also encountered the same problem. The error is triggered by one of the following conditions in dataset.py:

    def __getitem__(self, idx):
        image, text, idx_new = self.get(idx)
        # print(image, text, idx_new, idx)
        if not self.is_training: assert idx == idx_new, f'idx {idx} != idx_new {idx_new} during testing.'

`__getitem__` calls self.get(idx):

    def get(self, idx):
        with self.env.begin(write=False) as txn:
            image_key, label_key = f'image-{idx + 1:09d}', f'label-{idx + 1:09d}'
            try:
                label = str(txn.get(label_key.encode()), 'utf-8').strip()  # label
                if not set(label).issubset(self.character):
                    return self._next_image(idx)
                # label = re.sub('[^0-9a-zA-Z]+', '', label)
                if self.check_length and self.max_length > 0:
                    if len(label) > self.max_length or len(label) <= 0:
                        # logging.info(f'Long or short text image is found: {self.name}, {idx}, {label}, {len(label)}')
                        return self._next_image(idx)
                label = label[:self.max_length]
                # ... (the quoted excerpt ends here)

When one of these checks fails, get() internally calls self._next_image(idx), which generates a random index into the dataset, so idx_new no longer matches idx. For the time being I just skip these cases during the validation phase by changing the check in `__getitem__(self, idx)` of dataset.py to:

    if not self.is_training:
        if idx != idx_new: pass
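The mechanism described in this comment can be reproduced without LMDB or images. The following is a toy sketch, not ABINet code: `ToyDataset` is a hypothetical stand-in that keeps labels in a list, and its `_next_image` is assumed to retry through `get()` with a random index, as the comment describes.

```python
import random


class ToyDataset:
    """Stand-in that mimics the fallback behaviour described above."""

    def __init__(self, labels, charset, max_length, is_training):
        self.labels = labels
        self.character = set(charset)
        self.max_length = max_length
        self.is_training = is_training

    def __len__(self):
        return len(self.labels)

    def _next_image(self, idx):
        # Assumption: fall back to a randomly chosen sample and validate it
        # again. This only terminates if at least one label is valid.
        return self.get(random.randint(0, len(self) - 1))

    def get(self, idx):
        label = self.labels[idx].strip()
        if not set(label).issubset(self.character):
            return self._next_image(idx)
        if len(label) > self.max_length or len(label) <= 0:
            return self._next_image(idx)
        return label, idx

    def __getitem__(self, idx):
        label, idx_new = self.get(idx)
        if not self.is_training:
            assert idx == idx_new, f'idx {idx} != idx_new {idx_new} during testing.'
        return label


if __name__ == "__main__":
    data = ToyDataset(["abc", "中文", "xyz"], "abcdefghijklmnopqrstuvwxyz", 25,
                      is_training=False)
    print(data[0])  # valid label, returned unchanged
    try:
        data[1]  # label outside the charset, so another sample is substituted
    except AssertionError as err:
        print("AssertionError:", err)
```

In training mode the substitution goes unnoticed, which is why the error only shows up during testing. Replacing the assertion with `pass` hides the mismatch but still evaluates the substituted sample rather than the requested one.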

neverstoplearn commented 1 year ago

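@zzzzjx Hello, this happens because some samples in your test data did not pass data validation. In dataset.py, please check whether your test data passes the validation checks that call self._next_image(idx).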

Running the command

    CUDA_VISIBLE_DEVICES=0 python main.py --config=configs/train_abinet.yaml --phase test --test_root data/evaluation/IIIT5k_3000/ --image_only

produces:

[2023-03-30 20:20:13,250 main.py:179 INFO train-abinet] Read model from best-train-abinet
[2023-03-30 20:20:13,251 main.py:237 INFO train-abinet] Start validate
Traceback (most recent call last):
  File "main.py", line 246, in <module>
    main()
  File "main.py", line 238, in main
    last_metrics = learner.validate()
  File "/home/xxx/anaconda3/envs/torch182/lib/python3.7/site-packages/fastai/basic_train.py", line 391, in validate
    val_metrics = validate(self.model, dl, self.loss_func, cb_handler)
  File "/home/xxx/anaconda3/envs/torch182/lib/python3.7/site-packages/fastai/basic_train.py", line 57, in validate
    for xb,yb in progress_bar(dl, parent=pbar, leave=(pbar is not None)):
  File "/home/xxx/anaconda3/envs/torch182/lib/python3.7/site-packages/fastprogress/fastprogress.py", line 50, in __iter__
    raise e
  File "/home/xxx/anaconda3/envs/torch182/lib/python3.7/site-packages/fastprogress/fastprogress.py", line 41, in __iter__
    for i,o in enumerate(self.gen):
  File "/home/xxx/anaconda3/envs/torch182/lib/python3.7/site-packages/fastai/basic_data.py", line 75, in __iter__
    for b in self.dl: yield self.proc_batch(b)
  File "/home/xxx/anaconda3/envs/torch182/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 517, in __next__
    data = self._next_data()
  File "/home/xxx/anaconda3/envs/torch182/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1199, in _next_data
    return self._process_data(data)
  File "/home/xxx/anaconda3/envs/torch182/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1225, in _process_data
    data.reraise()
  File "/home/xxx/anaconda3/envs/torch182/lib/python3.7/site-packages/torch/_utils.py", line 429, in reraise
    raise self.exc_type(msg)
AssertionError: Caught AssertionError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/xxx/anaconda3/envs/torch182/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 202, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/xxx/anaconda3/envs/torch182/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/xxx/anaconda3/envs/torch182/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/xxx/anaconda3/envs/torch182/lib/python3.7/site-packages/torch/utils/data/dataset.py", line 219, in __getitem__
    return self.datasets[dataset_idx][sample_idx]
  File "/home/xxx/ABINet/dataset.py", line 147, in __getitem__
    if not self.is_training: assert idx == idx_new, f'idx {idx} != idx_new {idx_new} during testing.'
AssertionError: idx 0 != idx_new 376 during testing.

I am using the pretrained model. What could be the problem?

anthonyAndchen commented 1 year ago

Hello, I have also encountered this issue. I replaced

    if not self.is_training: assert idx == idx_new, f'idx {idx} != idx_new {idx_new} during testing.'

with

    if not self.is_training:
        if idx != idx_new: pass

but a new error then occurs during validation:

Traceback (most recent call last):
  File "main.py", line 246, in <module>
    main()
  File "main.py", line 234, in main
    learner.fit(epochs=config.training_epochs,
  File "/home/amax/anaconda3/envs/vitstr_hongwei/lib/python3.8/site-packages/fastai/basic_train.py", line 200, in fit
    fit(epochs, self, metrics=self.metrics, callbacks=self.callbacks+callbacks)
  File "/home/amax/anaconda3/envs/vitstr_hongwei/lib/python3.8/site-packages/fastai/basic_train.py", line 102, in fit
    if cb_handler.on_batch_end(loss): break
  File "/home/amax/anaconda3/envs/vitstr_hongwei/lib/python3.8/site-packages/fastai/callback.py", line 308, in on_batch_end
    self('batch_end', call_mets = not self.state_dict['train'])
  File "/home/amax/anaconda3/envs/vitstr_hongwei/lib/python3.8/site-packages/fastai/callback.py", line 251, in __call__
    for cb in self.callbacks: self._call_and_update(cb, cb_name, **kwargs)
  File "/home/amax/anaconda3/envs/vitstr_hongwei/lib/python3.8/site-packages/fastai/callback.py", line 241, in _call_and_update
    new = ifnone(getattr(cb, f'on_{cb_name}')(**self.state_dict, **kwargs), dict())
  File "/home/amax/hongwei/project/ABINet/callbacks.py", line 117, in on_batch_end
    last_metrics = self._validate()
  File "/home/amax/hongwei/project/ABINet/callbacks.py", line 65, in _validate
    val_metrics = validate(self.learn.model, dl, self.loss_func, cb_handler)
  File "/home/amax/anaconda3/envs/vitstr_hongwei/lib/python3.8/site-packages/fastai/basic_train.py", line 63, in validate
    if cb_handler and cb_handler.on_batch_end(val_losses[-1]): break
  File "/home/amax/anaconda3/envs/vitstr_hongwei/lib/python3.8/site-packages/fastai/callback.py", line 308, in on_batch_end
    self('batch_end', call_mets = not self.state_dict['train'])
  File "/home/amax/anaconda3/envs/vitstr_hongwei/lib/python3.8/site-packages/fastai/callback.py", line 250, in __call__
    for met in self.metrics: self._call_and_update(met, cb_name, **kwargs)
  File "/home/amax/anaconda3/envs/vitstr_hongwei/lib/python3.8/site-packages/fastai/callback.py", line 241, in _call_and_update
    new = ifnone(getattr(cb, f'on_{cb_name}')(**self.state_dict, **kwargs), dict())
  File "/home/amax/hongwei/project/ABINet/callbacks.py", line 205, in on_batch_end
    assert (pt_lengths == pt_lengths_).all(), f'{pt_lengths} != {pt_lengths_} for {pt_text}'
AssertionError: tensor([0], device='cuda:0') != tensor([26], device='cuda:0') for ['aaaaaaaaaaaaaaaaaaaaaaaaaa']

May I ask what the problem may be?
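Replacing the first assertion with `pass` means that evaluation silently scores substituted samples. One alternative is to drop invalid samples once, up front, so that every index maps to a fixed, valid sample. The following is a minimal sketch under assumptions: `ValidOnlyView` and `make_label_check` are hypothetical helpers, not part of ABINet, and samples are assumed to be `(image, label)` pairs. This thread does not establish whether doing so also resolves the `pt_lengths` assertion shown above.

```python
from typing import Callable, List, Sequence


def make_label_check(charset: str, max_length: int) -> Callable[[str], bool]:
    """Build a predicate applying the charset and length checks to a label."""
    allowed = set(charset)

    def is_valid(label: str) -> bool:
        label = label.strip()
        return 0 < len(label) <= max_length and set(label).issubset(allowed)

    return is_valid


class ValidOnlyView:
    """Expose only the samples whose label passes `is_valid`."""

    def __init__(self, samples: Sequence, is_valid: Callable[[str], bool]):
        self.samples = samples
        # Decide once which indices are usable, so lookups are deterministic.
        self.valid_indices: List[int] = [
            i for i, (_, label) in enumerate(samples) if is_valid(label)
        ]

    def __len__(self) -> int:
        return len(self.valid_indices)

    def __getitem__(self, i: int):
        return self.samples[self.valid_indices[i]]


if __name__ == "__main__":
    # Hypothetical (image, label) pairs; strings stand in for image data.
    demo = [("img0", "abc"), ("img1", "中文"), ("img2", ""), ("img3", "xyz")]
    view = ValidOnlyView(demo, make_label_check("abcdefghijklmnopqrstuvwxyz", 25))
    print(len(view), [view[i] for i in range(len(view))])
```

With this approach the reported accuracy covers only the samples that pass validation, so the number of dropped samples should be reported alongside it.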

FangShancheng / ABINet

AssertionError #34
