I believe there is an error in the splits of your data: test.txt contains 2299 lines, train.txt contains 2303 lines, and the eval set contains 398 lines.
But in the original CityPersons dataset, the test set consists of 1575 images, the train set of 2975 images, and the val set of 500 images, which does not match the number of lines in your imagesets txt files.
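
For reference, here is a minimal sketch of how the comparison can be reproduced. The helper names (`count_lines`, `find_mismatches`) are my own and not part of the repository, and counting only non-empty lines is an assumption on my side. The numbers are the ones quoted above.

```python
# Image counts per split, as quoted above for the original CityPersons dataset.
EXPECTED = {"train": 2975, "val": 500, "test": 1575}


def count_lines(path):
    """Count the non-empty lines in a text file (assumes one image per line)."""
    with open(path, encoding="utf-8") as f:
        return sum(1 for line in f if line.strip())


def find_mismatches(actual, expected=EXPECTED):
    """Return {split: (actual, expected)} for every split whose counts differ."""
    return {
        split: (actual.get(split), want)
        for split, want in expected.items()
        if actual.get(split) != want
    }


# Line counts reported above for the split files in this repository.
reported = {"train": 2303, "val": 398, "test": 2299}

mismatches = find_mismatches(reported)
for split, (got, want) in mismatches.items():
    print(f"{split}: {got} lines, expected {want} images")
```

To check the files directly instead of the hard-coded numbers, `reported` can be built with `count_lines`, for example `count_lines("train.txt")`, with the path adjusted to wherever the split files live in your checkout.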