Closed sm807983636 closed 4 years ago
The full error output is as follows:
C:/w/1/s/windows/pytorch/aten/src/THCUNN/BCECriterion.cu:42: block: [0,0,0], thread: [215,0,0] Assertion input >= 0. && input <= 1. failed.
C:/w/1/s/windows/pytorch/aten/src/THCUNN/BCECriterion.cu:42: block: [0,0,0], thread: [216,0,0] Assertion input >= 0. && input <= 1. failed.
...
C:/w/1/s/windows/pytorch/aten/src/THCUNN/BCECriterion.cu:42: block: [0,0,0], thread: [190,0,0] Assertion input >= 0. && input <= 1. failed.
C:/w/1/s/windows/pytorch/aten/src/THCUNN/BCECriterion.cu:42: block: [0,0,0], thread: [191,0,0] Assertion input >= 0. && input <= 1. failed.
Traceback (most recent call last):
File "F:/Data-group/experiments/online_main.py", line 493, in
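I would also like to ask where in the code the statement is that prints the vector on the line below "Model Interpreting..." in the output. When the run succeeds, every element of that vector is greater than 0; when the run fails, the vector contains zeros.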
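You can try stepping through with pdb, or print the tensors that go into the BCELoss computation, to check whether they really fall outside the range as the error claims. You can locate the print statement by searching the code; it is probably https://github.com/motefly/DeepGBM/blob/master/tree_model_interpreter.py#L110 .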
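The check suggested above can be sketched as a small helper. This is only an illustrative sketch: `summarize_bce_input` is a hypothetical name, not part of DeepGBM or PyTorch, and the call site shown in the comment assumes the tensor passed to BCELoss is called `outputs`. Note that NaN values also fail the `input >= 0. && input <= 1.` assertion, so the helper counts them separately.

```python
import math

def summarize_bce_input(values):
    """Summarize a flat sequence of floats that is about to be fed to BCELoss."""
    values = list(values)
    nan_count = sum(1 for v in values if math.isnan(v))
    finite = [v for v in values if not math.isnan(v)]
    # NaN fails both comparisons in the assertion, so it is reported on its own;
    # everything else is checked against the closed interval [0, 1].
    out_of_range = sum(1 for v in finite if v < 0.0 or v > 1.0)
    return {
        "count": len(values),
        "min": min(finite) if finite else None,
        "max": max(finite) if finite else None,
        "nan": nan_count,
        "out_of_range": out_of_range,
        "ok": nan_count == 0 and out_of_range == 0,
    }

# Hypothetical call site, placed just before the loss is computed
# (assumes the model output tensor is named `outputs`):
#   print(summarize_bce_input(outputs.detach().cpu().flatten().tolist()))

if __name__ == "__main__":
    print(summarize_bce_input([0.0, 0.5, 1.0]))
    print(summarize_bce_input([0.2, float("nan"), 1.3]))
```

If the summary reports NaN values rather than values slightly outside [0, 1], clamping the output will not help, because clamping leaves NaN unchanged. Setting the environment variable `CUDA_LAUNCH_BLOCKING=1`, or running the same batch on the CPU, usually makes the traceback point at the failing call more precisely.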
Hello, after I changed nslices it ran successfully. Is this parameter related to the size of the dataset?
This parameter is used for grouping; the number of groups should be adjusted according to the number of trees.
Thanks for the answer!
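When I run the binary classification online training and prediction task, I get the following error: RuntimeError: reduce failed to synchronize: cudaErrorAssert: device-side assert triggered. It reportedly means that the tensor passed to BCELoss has values outside [0, 1]. I searched online and tried many approaches, and I modified both the outputs in the TrainWithLog function in https://github.com/motefly/DeepGBM/blob/master/experiments/helper.py and the out in the true_loss function in https://github.com/motefly/DeepGBM/blob/master/experiments/models/components.py , but the run still fails with the same error. What exactly is going wrong?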
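For reference, one general way to avoid the range assertion, not something proposed in this thread, is to compute the loss from raw logits instead of from probabilities; this is what PyTorch's `torch.nn.BCEWithLogitsLoss` does. The plain-Python sketch below shows the numerically stable formula for a single value. The function name is hypothetical, and whether DeepGBM's `true_loss` can be switched to logits this way is an assumption. It also would not help if the model output already contains NaN.

```python
import math

def bce_with_logits(logit, target):
    """Binary cross-entropy for one example, computed from a raw logit.

    Algebraically equal to
        -[target * log(sigmoid(logit)) + (1 - target) * log(1 - sigmoid(logit))]
    but written so that no intermediate probability is ever formed, which means
    there is no value that must lie in [0, 1].
    """
    # max(x, 0) - x*y + log(1 + exp(-|x|)) is the standard stable rearrangement.
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))

if __name__ == "__main__":
    # A large-magnitude logit does not overflow or produce an out-of-range input.
    print(bce_with_logits(0.0, 1.0))
    print(bce_with_logits(-1000.0, 1.0))
```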