Bartzi / see

Code for the AAAI 2018 publication "SEE: Towards Semi-Supervised End-to-End Scene Text Recognition"
GNU General Public License v3.0
575 stars 147 forks

I got a question about the train_svhn.py #105

Open zimo99 opened 3 years ago

zimo99 commented 3 years ago

python train_svhn.py ../datasets/svhn/jsonfile/svhn_curriculum_specification.json ../datasets/svhn/runningLog/ -g 0 --char-map ../datasets/svhn/svhn_char_map.json -b 10

Traceback (most recent call last):
  File "train_fsns.py", line 84, in <module>
    train_dataset, validation_dataset = curriculum.load_dataset(0)
  File "/home/donglong_5/SEE/see-master/chainer/utils/baby_step_curriculum.py", line 40, in load_dataset
    train_dataset = self.dataset_class(self.train_curriculum[level], **self.dataset_args)
  File "/home/donglong_5/SEE/see-master/chainer/datasets/file_dataset.py", line 31, in __init__
    self.num_timesteps, self.num_labels = (int(i) for i in next(reader))
  File "/home/donglong_5/SEE/see-master/chainer/datasets/file_dataset.py", line 31, in <genexpr>
    self.num_timesteps, self.num_labels = (int(i) for i in next(reader))
ValueError: invalid literal for int() with base 10: '1 2'

Bartzi commented 3 years ago

Well, you did not separate the two numbers in the first line with a tab (\t). The file needs to be tab-separated (a tsv), not a csv.
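To illustrate why the first line fails, here is a minimal sketch of the parsing step. It assumes the loader reads the file with Python's csv.reader and a tab delimiter, which is inferred from the traceback and the reply above rather than copied from file_dataset.py; parse_header is a hypothetical helper name.

```python
import csv
import io


def parse_header(first_line):
    """Parse the first line of the ground truth file into two integers.

    Assumption: this mirrors line 31 of file_dataset.py as shown in the
    traceback, with a tab-delimited csv reader.
    """
    reader = csv.reader(io.StringIO(first_line), delimiter="\t")
    num_timesteps, num_labels = (int(i) for i in next(reader))
    return num_timesteps, num_labels


# A tab-separated header is split into two fields and parses cleanly.
print(parse_header("1\t2\n"))

# A space-separated header stays one field ('1 2'), so int() rejects it.
try:
    parse_header("1 2\n")
except ValueError as error:
    print(error)
```

With a space instead of a tab, the reader returns the whole line as a single field, which is the string that appears in the error message.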

zimo99 commented 3 years ago

Thank you very much. I'll try it now.

zimo99 commented 3 years ago

After changing the file to tsv, it looks like this:

4 4
/home/donglong_5/SEE/see-master/datasets/svhn/gridDataset/train/0.png 0 9 10 10 2 0 10 10 0 6 10 10 0 8 10 10
/home/donglong_5/SEE/see-master/datasets/svhn/gridDataset/train/1.png 0 9 10 10 0 1 10 10 4 4 10 10 8 10 10 10
...

But after trying again, there was still a problem:

python train_svhn.py ../datasets/svhn/jsonfile/svhn_curriculum_specification.json ../datasets/svhn/runningLog/ -g 0 --char-map ../datasets/svhn/svhn_char_map.json --blank-label 0 -b 128

Traceback (most recent call last):
  File "train_svhn.py", line 76, in <module>
    train_dataset, validation_dataset = curriculum.load_dataset(0)
  File "/home/donglong_5/SEE/see-master/chainer/utils/baby_step_curriculum.py", line 40, in load_dataset
    train_dataset = self.dataset_class(self.train_curriculum[level], **self.dataset_args)
  File "/home/donglong_5/SEE/see-master/chainer/datasets/file_dataset.py", line 31, in __init__
    self.num_timesteps, self.num_labels = (int(i) for i in next(reader))
  File "/home/donglong_5/SEE/see-master/chainer/datasets/file_dataset.py", line 31, in <genexpr>
    self.num_timesteps, self.num_labels = (int(i) for i in next(reader))
ValueError: invalid literal for int() with base 10: '4 4'

(PS: This is the changed JSON file:)

{
    "train": "/home/donglong_5/SEE/see-master/datasets/svhn/gridDataset/train.tsv",
    "validation": "/home/donglong_5/SEE/see-master/datasets/svhn/gridDataset/valid.tsv"
}

Bartzi commented 3 years ago

Okay, did you also change the delimiter between all values in the tsv file? All items need to be delimited by a tab character, not by spaces.
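If the file was written with spaces between the values, one way to fix it is to rewrite every line with tabs. The following is only a sketch: spaces_to_tabs is a hypothetical helper, the sample paths are made up, and it assumes that no image path contains whitespace (such a path would be split into several fields).

```python
def spaces_to_tabs(text):
    """Return the text with the fields of every line joined by single tabs."""
    converted = []
    for line in text.splitlines():
        fields = line.split()  # splits on any run of spaces or tabs
        if fields:  # skip empty lines
            converted.append("\t".join(fields))
    return "\n".join(converted) + "\n"


# Hypothetical sample in the same layout as the file shown in this thread.
sample = "4 4\n/data/train/0.png 0 9 10 10\n"
print(repr(spaces_to_tabs(sample)))
```

To apply it to a real file, read the file's contents, pass them through spaces_to_tabs, and write the result back out. Renaming the file to .tsv alone does not change its delimiter.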

zimo99 commented 3 years ago

Thanks (^▽^)

WangzekunY commented 3 years ago

I want to reproduce the 95.2% accuracy on SVHN reported in the paper, but I don't quite understand which dataset to use as the training set and which as the validation set. How did you obtain your validation set, and can the results of the original paper be reproduced?