AnaRhisT94 opened this issue 4 years ago (status: Open)
1: Yes, we used images from the left camera and the right camera.
I see, thank you for the answers! @JiaxiongQ I'll update later with my progress and write a full step-by-step on how to do it for people who are confused in the beginning like me.
@JiaxiongQ
For training with the synthetic data, I use the RGBRight and RGBLeft folders, the sparse LiDAR data from the lidar folder, and finally the ground-truth normals (generated from the dense LiDAR depth) from the Normal_m folder, right?
Question 1: Is it true that the folders above are the ones used for training?
Question 2: If yes, there is only sparse LiDAR for RGBLeft and Normals_m for RGBLeft, so why do we use RGBRight?
Yes, because we only generated surface normals from the depth of the left camera.
Thank you!
Hi @JiaxiongQ ,
After I prepared the 3 folders RGBLeft, lidar and Normals_m from /Town11/SEQ (to test that the training works), I'm getting the following error:
File "/home/unknown/depth_est/DeepLiDAR/submodels/depthCompleNew.py", line 155, in forward
inputS = torch.cat((sparse,mask),1)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 512 and 256 in dimension 2 at /pytorch/aten/src/THC/generic/THCTensorMath.cu:71
sparse.shape
(4, 1, 256, 512)
mask.shape
(4, 256, 512, 1)
I probably need to move the 1 in mask so it comes right after the 4, and that should fix it; I'll try that out and update. But why doesn't it work out of the box? I didn't see any posts about this when training the first NN, so did I do something wrong in the process?
EDIT: When I change the shape with np.transpose so that mask has shape (4, 1, 256, 512), I get a new error, and other errors appear if I change sparse instead. Any ideas how to solve this? I'm out of ideas, and I didn't see anyone here saying they got this error when training. I double- and triple-checked my paths and the images, and the number of images (495) is the same for each of the 3 folders, so the data itself should be fine.
In our 'dataloader/trainLoaderN.py' this is already handled, so you should not need to do 'np.transpose'. The same operation is applied to 'sparse', so their shapes should match.
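For reference, a minimal sketch of the kind of HWC-to-CHW conversion a PyTorch dataset usually performs in __getitem__ (the actual code in trainLoaderN.py may differ; the helper name here is illustrative):

import numpy as np
import torch

# Illustrative only: turn an H x W x C numpy array (e.g. a mask of shape
# (256, 512, 1)) into a C x H x W float tensor, so that the DataLoader's
# batching yields (B, C, H, W) as expected by torch.cat and nn.Conv2d.
def to_chw_tensor(arr_hwc):
    arr_chw = np.transpose(arr_hwc, (2, 0, 1))        # HWC -> CHW
    return torch.from_numpy(arr_chw.copy()).float()   # copy() gives a contiguous array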
I see, but that still doesn't work. I attached an image of the variables right before exiting the __getitem__ function in trainLoaderN.py.
Same error (I haven't changed anything in the code except loading the images in nomalLoader.py):
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 512 and 256 in dimension 2 at /pytorch/aten/src/THC/generic/THCTensorMath.cu:71
Also, while searching for a solution to this problem, I found suggestions to set batch_size = 1, which didn't help either:
TrainImgLoader = torch.utils.data.DataLoader(
DA.myImageFloder(all_left_img,all_normal,all_gts ,True, args.model),
batch_size = 1, shuffle=True, num_workers=1, drop_last=True)
Also, in trainN.py I printed the shapes before loss = train(...):
for batch_idx, (imgL_crop,sparse_n,mask,mask1,data_in1) in enumerate(TrainImgLoader):
start_time = time.time()
print(imgL_crop.shape)
print(sparse_n.shape)
print(mask.shape)
print(mask1.shape)
print(data_in1.shape)
Output:
(1, 3, 256, 512)
(1, 1, 256, 512)
(1, 256, 512, 3)
(1, 256, 512, 3)
(1, 256, 512, 3)
I really want to get this to work, and I have no idea why it doesn't.
Sorry, I don't know why this would happen, but you can use torch.permute() to change the dimension order and make all of the inputs have shape (b, c, 256, 512).
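As a minimal sketch of that suggestion (the tensor here is just a stand-in for the mask coming out of the DataLoader):

import torch

mask = torch.rand(1, 256, 512, 1)             # DataLoader output in (B, H, W, C) order
mask = mask.permute(0, 3, 1, 2).contiguous()  # reorder to (B, C, H, W)
print(mask.shape)                             # torch.Size([1, 1, 256, 512])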
Hi @JiaxiongQ ,
I'll try torch.permute() soon. Other than that, I'm out of ideas; any chance you could help me out with this further?
Here's the code that prepares the 3 folders from Town11/SEQ0:
import os
import numpy as np

def dataloader_synthetic(filepath):
    imagesl = []
    normalS = []
    normal_gts = []

    temp = filepath
    filepathl = temp + 'Town11/SEQ0'      # RGB dataset folder, Left and Right
    filepathgt = filepathl + '/Normal_m'  # ground-truth surface normals
    #seqs = [seq for seq in os.listdir(filepathl) if seq.find('sync') > -1]

    left_fold = '/RGBLeft'
    right_fold = '/RGBright'              # note: case must match the folder name on disk
    lidar_foldl = '/lidar'
    #lidar_foldr = '/proj_depth/velodyne_raw/image_03'

    #for seq in seqs:
    left_path = filepathl + left_fold
    right_path = filepathl + right_fold

    # RGB images from the left camera (the right camera list is currently unused)
    lc = [os.path.join(left_path, img) for img in os.listdir(left_path)]
    lc.sort()
    #lc = lc[5:-5]
    rc = [os.path.join(right_path, img) for img in os.listdir(right_path)]
    rc.sort()
    #rc = rc[5:-5]
    imagesl = np.append(imagesl, lc)
    #imagesl = np.append(imagesl, rc)

    gt_path = filepathgt
    lids2l = filepathl

    # sparse LiDAR depth maps
    lidar2l = [os.path.join(lids2l + lidar_foldl, lid) for lid in os.listdir(lids2l + lidar_foldl)]
    lidar2l.sort()
    normalS = np.append(normalS, lidar2l)
    #lids2r = os.path.join(filepathl, seq) + lidar_foldr
    #lidar2r = [os.path.join(lids2r, lid) for lid in os.listdir(temp)]
    #lidar2r.sort()
    #normalS = np.append(normalS, lidar2r)

    # ground-truth surface normals generated from the dense depth
    gt_imgs = [os.path.join(gt_path, norm) for norm in os.listdir(gt_path)]
    gt_imgs.sort()
    normal_gts = np.append(normal_gts, gt_imgs)
    #normal_gts = np.append(normal_gts, gt_imgs)

    left_train = imagesl
    normalS_train = normalS
    return left_train, normalS_train, normal_gts
Didn't change anything else.
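As a quick sanity check of the loader above (a minimal sketch; the dataset root path is hypothetical):

# The three lists should have the same length and line up one-to-one after sorting.
left_train, normalS_train, normal_gts = dataloader_synthetic('/path/to/synthetic/')  # hypothetical root
print(len(left_train), len(normalS_train), len(normal_gts))  # e.g. 495 495 495
for rgb, lid, gt in list(zip(left_train, normalS_train, normal_gts))[:3]:
    print(rgb, lid, gt)  # eyeball that the frame indices correspond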
After using torch.permute(), it no longer throws that error, but there's a new error in the nomal_loss function (there is a torch tensor inside that tuple), so I guess it needs to be converted to a tensor, or not permuted at all. I'm not sure why I'm getting all these errors when no one else has posted any of them here. The new error:
pred_n = pred.permute(0,2,3,1)
AttributeError: 'tuple' object has no attribute 'permute'
Full code of that function:
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable

def nomal_loss(pred, targetN, mask1):
    valid_mask = (mask1 > 0.0).detach()
    print(type(pred))
    print(pred)
    pred_n = pred.permute(0, 2, 3, 1)
    pred_n = pred_n[valid_mask]
    target_n = targetN[valid_mask]
    pred_n = pred_n.contiguous().view(-1, 3)
    pred_n = F.normalize(pred_n)
    target_n = target_n.contiguous().view(-1, 3)
    loss_function = nn.CosineEmbeddingLoss()
    loss = loss_function(pred_n, target_n,
                         Variable(torch.Tensor(pred_n.size(0)).cuda().fill_(1.0)))
    return loss
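For what it's worth, a minimal shape check of this loss under the contract the code implies (pred as a single (B, 3, H, W) tensor, targetN and mask1 as (B, H, W, 3)); this is only a sketch and assumes a CUDA device, since the loss builds its target labels on the GPU:

import torch

B, H, W = 1, 256, 512
pred_t   = torch.rand(B, 3, H, W).cuda()    # a plain tensor, not the tuple the model returns
target_t = torch.rand(B, H, W, 3).cuda()
# per-pixel validity replicated over the 3 channels, so the flattened view(-1, 3) stays consistent
mask_t   = (torch.rand(B, H, W, 1) > 0.5).float().repeat(1, 1, 1, 3).cuda()
print(nomal_loss(pred_t, target_t, mask_t))  # should print a scalar loss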
Now I changed it to pred_n = pred[0] and got a new error:
pred_n = pred_n[valid_mask]
IndexError: The shape of the mask [1, 3, 256, 512] at index 1 does not match the shape of the indexed tensor [1, 2, 256, 512] at index 1
This code is mainly for KITTI; you should modify it and just ensure that the file names can be matched.
Hi @JiaxiongQ , yes, I did; I modified it to work with the 3 synthetic-data folders, and it still doesn't work. (You can see most of it is commented out, and I renamed the function.)
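One way to verify the correspondence @JiaxiongQ mentions (a minimal sketch; the assumption that the frame index is encoded in the file name may need adjusting for the synthetic data's naming scheme):

import os

def check_correspondence(left_train, normalS_train, normal_gts):
    # Every RGB frame should have a sparse LiDAR file and a GT-normal file with a matching stem.
    assert len(left_train) == len(normalS_train) == len(normal_gts)
    for rgb, lid, gt in zip(left_train, normalS_train, normal_gts):
        stem = os.path.splitext(os.path.basename(rgb))[0]
        assert stem in os.path.basename(lid) and stem in os.path.basename(gt), (rgb, lid, gt)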
Regarding Q2, the raw KITTI data overview page provides a raw_data_downloader.sh script to download and extract all of the raw data zip files. A slightly modified version with cleaner status info output can be found here: https://gist.github.com/valgur/cb9da4d1370ccc13c7c6b7c8c632d3e2
(Quoting the earlier comment with the torch.cat size-mismatch error when training on /Town11/SEQ.)
I met the same problem with the dimension mismatch; I had to change a lot to fit it to the synthetic dataset. You could debug the program step by step and change the dimension order to fix it; the recommended dimension order in PyTorch is (B, C, H, W).
I changed the order like this:
inputl = inputl.cuda()# .permute(0,2,3,1)
sparse = sparse.cuda()# permute(0,2,3,1)
gt1 = gt1.cuda().permute(0,3,1,2)
mask1 = mask1.cuda().permute(0,3,1,2)
mask = mask.cuda().permute(0,3,1,2)
I hope it helps you.
Q1: Yes, we use images from the left camera and the right camera.
Q2: We didn't find such a link; we just downloaded the dataset one by one.
Q3: It is flexible how you organize the files; you just need to make sure that all the images correspond.
Q4: No, you just need to use the one DCU to train the surface normals. The synthetic data is used to improve the quality of the surface normals; the download link is in README.md.
Q5: The whole training process takes 15 epochs on 3 GPUs (1080 Ti).
About Q2: you can find it in the tool sets on the KITTI homepage; someone provides a script to download them all.
Hi, thanks for this amazing repo. @JiaxiongQ
I'm trying to get trainN.py and nomalLoader.py to work in order to train the first NN. This is what I understood so far that I need in order to train:
- Download data_depth_velodyne, which is the sparse LiDAR dataset.
- Download data_depth_annotated, which is the ground-truth (dense) LiDAR dataset.
- Use the second repo in order to generate the ground-truth normals from the ground-truth dense LiDAR dataset.
- Download ALL the RGB KITTI images from all the categories (City | Residential | Road | Campus | Person | Calibration). Is there a link to download them all at once instead of downloading one by one?
Question 1: Do I need to extract all the RGB images into the folders one by one into data_depth_velodyne/train/..*sync/ - i.e., do I need to add image_02 and image_03 folders to each of the sync folders? (This is implied by your code.)
Question 2: Is there a way to download all the RGB images in one shot instead of clicking and extracting them one by one into all the folders?
In nomalLoader.py the function dataloader(filepath) returns 3 variables: left_train, normalS_train, normal_gts, which are:
a. left_train - the RGB KITTI image folders 'data_depth_velodyne/train/..*sync/image02 & 03/data'.
b. normalS_train - the sparse LiDAR folders 'data_depth_velodyne/train/..*sync/proj_depth/velodyne_raw/image02 & 03/'.
c. normal_gts - the folder which has all the normals I generated from the dense ground truth: data_depth_annotated/*_sync/proj_depth/groundtruth/image_02 & image_03 -> gt/out/train/*_sync/image_02 & image_03, or should it all be in gt/out/train/*_sync/? Because in the code there isn't anything about concatenating image_02 & image_03.
Question 3: Please look at c.; I asked there about the ground-truth normals.
Question 4: When and where is the synthetic data used? Do we use it also in trainN.py? Do we use it in all 3 NNs?
Question 5: How many epochs are recommended to train for?
Other than that, thank you. It took me so many hours just to get to the point where I understand how to get the data ready (and I'm still trying). I'll definitely add a guide on how to prepare the data for training after this post, so others can save many hours understanding the process.
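For anyone following along, a small sketch of how one might verify the KITTI layout described above before training (paths follow the post; the exact structure of your extraction may differ):

import glob
import os

# Hypothetical root paths; adjust to wherever the zips were extracted.
velodyne_root = 'data_depth_velodyne/train'
normals_root  = 'gt/out/train'

for seq in sorted(glob.glob(os.path.join(velodyne_root, '*_sync'))):
    for cam in ('image_02', 'image_03'):
        rgb    = os.path.join(seq, cam, 'data')                          # extracted raw RGB images
        sparse = os.path.join(seq, 'proj_depth', 'velodyne_raw', cam)    # sparse LiDAR
        gt_n   = os.path.join(normals_root, os.path.basename(seq), cam)  # generated GT normals
        for p in (rgb, sparse, gt_n):
            if not os.path.isdir(p):
                print('missing:', p)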