zlckanata / DeepGlobe-Road-Extraction-Challenge

D-LinkNet: LinkNet with Pretrained Encoder and Dilated Convolution for High Resolution Satellite Imagery Road Extraction
http://openaccess.thecvf.com/content_cvpr_2018_workshops/papers/w4/Zhou_D-LinkNet_LinkNet_With_CVPR_2018_paper.pdf
MIT License

train.py : RuntimeError: invalid argument 1: #2

Closed zxshi closed 6 years ago

zxshi commented 6 years ago

I get this error when I run train.py. Have the BUPT authors run into it before? How can I fix it?

    Traceback (most recent call last):
      File "ttest.py", line 71, in <module>
        data_loader_iter = iter(data_loader)
      File "/home/Software/anaconda3/envs/python-3.5/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 310, in __iter__
        return DataLoaderIter(self)
      File "/home/Software/anaconda3/envs/python-3.5/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 180, in __init__
        self._put_indices()
      File "/home/Software/anaconda3/envs/python-3.5/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 219, in _put_indices
        indices = next(self.sample_iter, None)
      File "/home/Software/anaconda3/envs/python-3.5/lib/python3.5/site-packages/torch/utils/data/sampler.py", line 119, in __iter__
        for idx in self.sampler:
      File "/home/Software/anaconda3/envs/python-3.5/lib/python3.5/site-packages/torch/utils/data/sampler.py", line 50, in __iter__
        return iter(torch.randperm(len(self.data_source)).long())
    RuntimeError: invalid argument 1: must be strictly positive at /opt/conda/conda-bld/pytorch_1512954043090/work/torch/lib/TH/generic/THTensorMath.c:2184

zlckanata commented 6 years ago

This is most likely a data problem. Please check the data under the "dataset/train/" folder. If you extract the DeepGlobe training data there, it should run as is.

zxshi commented 6 years ago

OK, thanks!

1. I first tried training with the DeepGlobe data, and it worked. With Python 3, however, I hit this error:

    Traceback (most recent call last):
      File "train.py", line 44, in <module>
        for img, mask in data_loader_iter:
      File "/home/szx/Software/anaconda3/envs/python-3.5/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 210, in __next__
        return self._process_next_batch(batch)
      File "/home/szx/Software/anaconda3/envs/python-3.5/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 230, in _process_next_batch
        raise batch.exc_type(batch.exc_msg)
    TypeError: Traceback (most recent call last):
      File "/home/Software/anaconda3/envs/python-3.5/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 42, in _worker_loop
        samples = collate_fn([dataset[i] for i in batch_indices])
      File "/home/Software/anaconda3/envs/python-3.5/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 42, in <listcomp>
        samples = collate_fn([dataset[i] for i in batch_indices])
      File "/media/files/DeepGlobe-Road-original/data.py", line 125, in __getitem__
        id = self.ids[index]
    TypeError: 'map' object is not subscriptable

The fix is to convert trainlist to a list in data.py (see the third line below):

    class ImageFolder(data.Dataset):
        def __init__(self, trainlist, root):
            self.ids = list(trainlist)
            self.loader = default_loader
            self.root = root

2. I will keep trying with my own data and report back once I have results.

zxshi commented 6 years ago

The label images in my data have a bit depth of 8, while DeepGlobe's have 24. Could that be what causes the error?

zlckanata commented 6 years ago

If you are using your own data, the file naming most likely differs from DeepGlobe's. Judging by the first error you reported, the dataloader did not get any data list, meaning that imagelist and trainlist on lines 21 and 22 of train.py are both empty. Try printing len(imagelist) and len(trainlist); both should be 0.

Fix: modify lines 21 and 22 of train.py, as well as lines 92 and 93 of data.py.
https://github.com/zlkanata/DeepGlobe-Road-Extraction-Challenge/blob/d274cdcb34eb93798f8ec17c7f5a200e70a2b969/train.py#L21-L22
https://github.com/zlkanata/DeepGlobe-Road-Extraction-Challenge/blob/d274cdcb34eb93798f8ec17c7f5a200e70a2b969/data.py#L92-L93
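The check suggested above can be sketched roughly as follows. This is not the repository's code: `build_train_ids` and its `suffix` parameter are hypothetical names, and the sketch assumes satellite tiles are named `<id>_sat.jpg`, so a folder whose files follow another pattern yields an empty list.

```python
import os
import tempfile


def build_train_ids(root, suffix="_sat.jpg"):
    """Return sorted IDs of files in root whose names end with suffix.

    The suffix is an assumption about the naming convention; adjust it
    to match your own data.
    """
    ids = [name[:-len(suffix)] for name in os.listdir(root)
           if name.endswith(suffix)]
    return sorted(ids)


# Demo in a throwaway folder: only one file follows the assumed pattern.
with tempfile.TemporaryDirectory() as root:
    for name in ["104_sat.jpg", "104_mask.png", "road_0001.tif"]:
        open(os.path.join(root, name), "w").close()
    demo_ids = build_train_ids(root)

# An empty folder (or one with differently named files) gives no IDs,
# which is the situation that makes the sampler fail.
with tempfile.TemporaryDirectory() as empty_root:
    empty_ids = build_train_ids(empty_root)

print(len(demo_ids), len(empty_ids))
```

Printing the length before building the DataLoader makes a naming mismatch visible immediately, instead of surfacing later as a sampler error.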

zxshi commented 6 years ago

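Thanks a lot. After renaming my images and labels to the DeepGlobe format, it runs.

A label bit depth of 8 versus 24 makes no difference. With 8-bit labels, though, the background has to be 0 and the other label 255 for training to work. At first I set the background to 0 and the label to 1 and trained on 38 images; the test results were completely black, and they were also completely black when opened in ArcGIS.

Next I plan to run experiments on a large amount of data, and I hope for good results.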

zlckanata commented 6 years ago

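You're welcome~ In the labels DeepGlobe provides, the background is 0 and roads are 255, so after a label is read it is divided by 255 as "normalization".
https://github.com/zlkanata/DeepGlobe-Road-Extraction-Challenge/blob/d274cdcb34eb93798f8ec17c7f5a200e70a2b969/data.py#L111
Good luck with your results!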
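A small pure-Python sketch of why 0/1 labels end up all black after this normalization. The helper names are hypothetical, and the 0.5 binarization threshold is an assumption made for illustration; only the division by 255 comes from the thread.

```python
def normalize(pixels, threshold=0.5):
    """Divide 8-bit labels by 255, then binarize.

    The threshold is an assumption for this sketch.
    """
    return [[1 if p / 255.0 >= threshold else 0 for p in row]
            for row in pixels]


def to_deepglobe_labels(pixels):
    """Map every nonzero label to 255 and keep background at 0."""
    return [[255 if p else 0 for p in row] for row in pixels]


mask_01 = [[0, 1],
           [1, 0]]          # background 0, road 1

# 1 / 255 is about 0.004, so every pixel falls below the threshold.
before = normalize(mask_01)

# Rescaling the labels to 0/255 first preserves the road pixels.
after = normalize(to_deepglobe_labels(mask_01))

print(before, after)
```

Converting custom masks to 0/255 once, before training, avoids touching the loader's normalization.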

zxshi commented 6 years ago

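Is no validation set used during training? After downloading the validation set I found that it only contains sat images and no labels. Is that right?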

zlckanata commented 6 years ago

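The organizers do not provide labels for the validation set. You need to submit your predicted labels for the validation set to the official site, and the organizers return a score.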

zxshi commented 6 years ago

Understood, thank you for your reply!

zxshi commented 6 years ago

image image

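Do you know why some areas are not detected? In the case above, for example, the main road is detected, but the road that crosses it from top to bottom is only half detected.

Question 1: there is another road to the left of the small road that is not detected at all, which seems unreasonable to me.

Question 2: could you give me some pointers on how to improve the road that is only half detected?

Thanks for your reply.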

zlckanata commented 6 years ago

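On question 1: if this road is missed entirely, the network has decided with fairly high confidence that it is not a road to be detected. Check whether similar areas in the dataset are actually labeled as roads.

On question 2: a half-detected road also comes down to the confidence not being high enough (for example, between 0.4 and 0.6). Because a hard binarization is applied here, the road looks broken. You can post-process the probability map with something like a CRF to smooth the overall confidence, though this step needs to be handled with care. You can also post-process the binarized map with graph algorithms (see the post-processing part of https://github.com/snakers4/spacenet-three).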
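To illustrate how hard binarization breaks a road whose confidence sits between 0.4 and 0.6, here is a minimal sketch that uses hysteresis thresholding. This is a simpler, swapped-in technique, not the CRF or the graph-based post-processing mentioned above, and the function name, thresholds, and toy probability map are all assumptions for illustration.

```python
from collections import deque


def hysteresis_threshold(prob, low=0.4, high=0.6):
    """Keep pixels >= high, plus pixels >= low connected to them.

    Connectivity is 8-neighbour; prob is a 2D list of probabilities.
    """
    rows, cols = len(prob), len(prob[0])
    keep = [[False] * cols for _ in range(rows)]
    queue = deque()
    # Seed the search with confident pixels.
    for r in range(rows):
        for c in range(cols):
            if prob[r][c] >= high:
                keep[r][c] = True
                queue.append((r, c))
    # Grow into neighbouring pixels that pass the lower threshold.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and not keep[nr][nc] and prob[nr][nc] >= low):
                    keep[nr][nc] = True
                    queue.append((nr, nc))
    return [[1 if k else 0 for k in row] for row in keep]


# Toy probability map: one road along the top row with a weak middle,
# plus an isolated weak pixel that should stay background.
prob = [
    [0.9, 0.8, 0.45, 0.42, 0.7],
    [0.1, 0.1, 0.1, 0.1, 0.1],
    [0.1, 0.45, 0.1, 0.1, 0.1],
]

hard = [[1 if p >= 0.5 else 0 for p in row] for row in prob]
linked = hysteresis_threshold(prob)
print(hard[0], linked[0])
```

In this toy map, the hard threshold leaves a gap in the top row, while the hysteresis version keeps the weak middle pixels because they touch confident ones; the isolated weak pixel stays background. Whether this helps on real predictions depends on the data.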

zxshi commented 6 years ago

Thank you very much!

l53ma commented 4 years ago

Hello, I have the same problem when running train.py with Python 3. Here is the error:

    Traceback (most recent call last):
      File "train.py", line 39, in <module>
        num_workers=4)
      File "/home/ev1-ws4/anaconda3/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 176, in __init__
        sampler = RandomSampler(dataset)
      File "/home/ev1-ws4/anaconda3/lib/python3.5/site-packages/torch/utils/data/sampler.py", line 66, in __init__
        "value, but got num_samples={}".format(self.num_samples))
    ValueError: num_samples should be a positive integer value, but got num_samples=0

I printed the lengths of "imagelist" and "trainlist", which are 6226 and 0, respectively.

    imagelist = filter(lambda x: x.find('sat')!=-1, os.listdir(ROOT))  # length: 6226
    trainlist = map(lambda x: x[:-8], imagelist)  # length: 0
    x = list(imagelist)
    y = list(trainlist)
    print(len(x), len(y))

Any ideas how to solve this problem? Many thanks.
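One possible explanation, not confirmed in this thread: in Python 3, `filter` and `map` return lazy one-shot iterators. In the snippet as posted, `list(imagelist)` consumes the filter iterator, so the `map` built on top of it has nothing left to yield. The sketch below reproduces this with a made-up file list (the names are assumptions) and shows an eager variant that builds real lists.

```python
names = ["100_sat.jpg", "100_mask.png", "101_sat.jpg"]  # made-up file names

# Lazy version, as in the posted snippet.
imagelist = filter(lambda x: x.find('sat') != -1, names)
trainlist = map(lambda x: x[:-8], imagelist)
x = list(imagelist)   # consumes the underlying filter iterator
y = list(trainlist)   # the map now finds it exhausted
lazy_counts = (len(x), len(y))

# Eager version: list comprehensions can be reused and measured freely.
imagelist = [n for n in names if n.find('sat') != -1]
trainlist = [n[:-8] for n in imagelist]
eager_counts = (len(imagelist), len(trainlist))

print(lazy_counts, eager_counts)
```

If the debug lines that convert `imagelist` run before the dataset is built, the same exhaustion would leave the dataset with zero samples. Building both lists eagerly avoids the problem regardless of evaluation order.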