Closed Changliu52 closed 5 years ago
Dear Chang,
I have two ideas: 1) There are orphaned shared mem files in /dev/shm/ Close all GPU-Voxels executables and delete all SharedMem files in there.
2) Shared mem is not supported on ARM My experiments with the TK1 board are already 2 years ago, but at that point in time, the shared memory functions from CUDA were not available on the ARM platforms.
There are examples shipped with CUDA that cover shared memory: http://docs.nvidia.com/cuda/cuda-samples/index.html#simpleipc Perhaps you can compile them and check, if the feature is supported by now.
Critical API call: cudaIpcGetMemHandle(&my_handle, data);
Please report your findings.
Cheers, Andreas
I see. Thank you. I found that TX2 arm still doesn't support IPC api call... Couldn't compile the simpleIPC.cu compilation output:
>>> WARNING - simpleIPC is not supported on aarch64 - waiving sample <<<
[@] /usr/local/cuda-8.0/bin/nvcc -ccbin g++ -I../../common/inc -m64 -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_60,code=compute_60 -o simpleIPC.o -c simpleIPC.cu
[@] /usr/local/cuda-8.0/bin/nvcc -ccbin g++ -m64 -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_60,code=compute_60 -o simpleIPC simpleIPC.o
[@] mkdir -p ../../bin/aarch64/linux/release
[@] cp simpleIPC ../../bin/aarch64/linux/release
Any suggestion on potential ways to visualise it without IPC calls?
this link seems relevant ? tx2 seems already sharing the memory between GPU and arm.
You are right: As the embedded devices do share the same physical memory, it should be possible to accomplish the visualization. Your link contains another link to Zero-Copy: http://arrayfire.com/zero-copy-on-tegra-k1/ which looks promising. But replacing the cudaIpcGetMemHandle() functions by cudaHostGetDevicePointer() is perhaps not that easy. If I understand the docs right, you also have to allocate the memory by a special allocator. And also then, I am not sure, if the OS lets you just pass and access the pointers between two individual processes. Perhaps you want to try and find out?
Another possibility would be to create a single executable which contains the visualizer... So that no IPC is required at all. The required changes should not be too big.
Off-topic: The ArrayFire report makes clear, that on a device with unified physical memory, the performance of GPU-Voxels could be increased by using Zero-Copy in general.
Thank you for your valuable suggestions.
I agree the zerocopy is the perfect solution to that. However considering the complexity of the changes I am only using it to various the planning. So probably creating the single executable containing visualiser will do the job.
I will need your further advice to create that. Shall I just use Cudamemcpy?
Chang
Hi Chang, no, definitely no memcopy. That's the whole point of the SharedMemory visualization (: So currently we have CUDA results in the GPU Mem, that we want to visualize. Therefore we generate a mem-handle to that data in the main program and write that into a host-shared-mem segment. Then the visualizer program reads that handle from the host-shared-mem and uses it, to get access to the CUDA data to wrap it into OpenGL calls.
So all you need to do is to bypass the host-shared-mem and give the visualizer the pointer to the CUDA memory directly. This can work if both is running in the same program. So still "shared memory" in terms of data creation and visualization, but no more shared-mem between two separate programs.
The code changes should be quite easy, as you can do it in clearly defined spots: Namely in the interface classes between provide and visualizer: https://github.com/fzi-forschungszentrum-informatik/gpu-voxels/tree/master/packages/gpu_voxels/src/gpu_voxels/vis_interface and https://github.com/fzi-forschungszentrum-informatik/gpu-voxels/blob/master/packages/gpu_voxels/src/gpu_visualization/SharedMemoryManager.h or the other SharedMemoryManager* files.
If you do this, please commit it to some branch here at GitHub, so that I can eventually help you, but also that we can merge it to upstream for the other TX2 users.
Andreas
For the moment the Visualizer does not work on the Jetson TX2, as it does not support the Shared Memory IPC functionality used by the Visualizer.
We are thinking about expanding mobile platform support in the future. Meanwhile, if someone else wants to try Andreas' proposed non-shared-memory single-process approach, I would be happy to hear about it :-)
Thank you so much guys for the clarifications. I certainly want to try it. Here is my TX2 version: https://github.com/Changliu52/gpu-voxels I am not really familiar with cuda. How can I get the pointer from CUDA memory? Maybe you can give me one example for changing it in vis_interface and SharedMemoryManager.h, and I can do the rest bits.
Hello all,
I was trying to make the serialization of the data in the JetsonTX2 and deserialize it on a X86 platform to visualize it there, as @Squelsh recommend in the issue #79
The problem I have is that I can't find a way to take the voxel data (the maps) from the memory to serialize it. Is there any option already developed in the repository to do it, or does some one has any recommendation on how to do it.
Thanks in advanced
Hi Alejandro,
I assume you are using VoxelMaps?
For voxelmaps the process could involve getDeviceDataPtr and cudaMemcpy: https://github.com/fzi-forschungszentrum-informatik/gpu-voxels/blob/master/packages/gpu_voxels/src/gpu_voxels/voxelmap/TemplateVoxelMap.h#L69
We did add a more convenient "copy device data to host" functionality to TemplateVoxelList, using thrust::copy: https://github.com/fzi-forschungszentrum-informatik/gpu-voxels/blob/master/packages/gpu_voxels/src/gpu_voxels/voxellist/TemplateVoxelList.hpp#L289
Is this helpful?
Hi Christian,
Yes, I am using VoxelMaps. Where do I get Voxel* dev_data
to call the constructor?
I am not not sure on how to proceed with cudaMemcpy and then introduce the data in the structure of ProtoBuf. Any idea?
Hi Alejandro,
does this code example help you? It shows how to get the Voxel* on the serialization side. https://github.com/fzi-forschungszentrum-informatik/gpu-voxels/issues/57#issuecomment-333923749
You can then cudaMemcpy the data to host memory, transfer it to your target system, cudaMemcpy the data into device memory and pass the pointer to the VoxelMap constructor.
I will keep this issue open until we have a solution to visualize voxel data on Jetson platforms.
My currently preferred solution for this is to publish the data in ROS sensor_msgs::PointCloud format and then to render that point-cloud as cubes in RViz:
Is this feature already in this package?
@Changliu52 Do you mean the publishing as a ROS/PCL PointCloud as in that image?
It's actually just a use of TemplateVoxelList::copyCoordsToHost and then creating a pcl::PointXYZ for each voxel, assigning the center of the point in real world coordinates based on voxel_side_length and the voxel coordinates. The next bugfix release will contain an example with the code for this. The color and shape of the voxels is set in the RViz PointCloud settings.
On copyCoordsToHost:
Great. Christain. Thank you very much for sharing the insights. Very useful.
Kind Regards,
Chang
On 21 Mar 2019, at 01:02, Christian Jülg notifications@github.com wrote:
@Changliu52 Do you mean the publishing as a ROS/PCL PointCloud as in that image?
It's actually just a use of TemplateVoxelList::copyCoordsToHost and then creating a pcl::PointXYZ for each voxel, assigning the center of the point in real world coordinates based on voxel_side_length and the voxel coordinates. The next bugfix release will contain an example with the code for this. The color and shape of the voxels is set in the RViz PointCloud settings.
On copyCoordsToHost:
https://github.com/fzi-forschungszentrum-informatik/gpu-voxels/blob/master/packages/gpu_voxels/src/gpu_voxels/voxellist/TemplateVoxelList.h#L118 https://github.com/fzi-forschungszentrum-informatik/gpu-voxels/blob/master/packages/gpu_voxels/src/gpu_voxels/test/testing_voxellist.cu#L1094 — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.
@Changliu52 Do you mean the publishing as a ROS/PCL PointCloud as in that image?
It's actually just a use of TemplateVoxelList::copyCoordsToHost and then creating a pcl::PointXYZ for each voxel, assigning the center of the point in real world coordinates based on voxel_side_length and the voxel coordinates. The next bugfix release will contain an example with the code for this. The color and shape of the voxels is set in the RViz PointCloud settings.
On copyCoordsToHost:
Hello,
Any news about the new bugfix release? I am also working with Jetson and it would save me some time if there is an example of ROS Pointcloud publisher.
Best regards, Mehmet
Hi Mehmet,
this snippet uses TemplateVoxelList::copyCoordsToHost to copy the data and creates pcl::PointXYZ objects from the resulting Vector3ui objects:
/**
* Publish the voxels of a VoxelList as a pointcloud in ROS to demonstrate ROS interoperability.
*
* In RViz: add the pointcloud, set render settings to Cube, scale to voxel_side_length and alpha to 0.5
*/
template <class VoxelList>
void publish_pointcloud(shared_ptr<VoxelList>& voxel_list_ptr, ros::Publisher& publisher)
{
double voxel_offset = voxel_side_length * 0.5f;
std::vector<gpu_voxels::Vector3ui> voxels;
voxel_list_ptr->copyCoordsToHost(voxels);
PclPointCloud::Ptr pointcloud_msg(new PclPointCloud); // shared pointer captures new
pointcloud_msg->header.frame_id = voxel_point_cloud_frame; // use same reference as in transform_...
pcl_conversions::toPCL(ros::Time::now(), pointcloud_msg->header.stamp);
pointcloud_msg->height = 1;
pointcloud_msg->width = voxels.size();
for(size_t i=0; i < voxels.size(); i++)
{
Vector3ui voxel = voxels[i];
PointXYZ voxel_point;
voxel_point.x = voxel.x * voxel_side_length + voxel_offset;
voxel_point.y = voxel.y * voxel_side_length + voxel_offset;
voxel_point.z = voxel.z * voxel_side_length + voxel_offset;
pointcloud_msg->points.push_back(voxel_point);
}
publisher.publish(pointcloud_msg);
}
You can use TemplateVoxelList::copyCoordsToHost and then create a pcl::PointXYZ for each voxel.
The color and shape of the voxels is set in the RViz PointCloud settings.
copyCoordsToHost
code:
Hi Mehmet,
this snippet uses TemplateVoxelList::copyCoordsToHost to copy the data and creates pcl::PointXYZ objects from the resulting Vector3ui objects:
/** * Publish the voxels of a VoxelList as a pointcloud in ROS to demonstrate ROS interoperability. * * In RViz: add the pointcloud, set render settings to Cube, scale to voxel_side_length and alpha to 0.5 */ template <class VoxelList> void publish_pointcloud(shared_ptr<VoxelList>& voxel_list_ptr, ros::Publisher& publisher) { double voxel_offset = voxel_side_length * 0.5f; std::vector<gpu_voxels::Vector3ui> voxels; voxel_list_ptr->copyCoordsToHost(voxels); PclPointCloud::Ptr pointcloud_msg(new PclPointCloud); // shared pointer captures new pointcloud_msg->header.frame_id = voxel_point_cloud_frame; // use same reference as in transform_... pcl_conversions::toPCL(ros::Time::now(), pointcloud_msg->header.stamp); pointcloud_msg->height = 1; pointcloud_msg->width = voxels.size(); for(size_t i=0; i < voxels.size(); i++) { Vector3ui voxel = voxels[i]; PointXYZ voxel_point; voxel_point.x = voxel.x * voxel_side_length + voxel_offset; voxel_point.y = voxel.y * voxel_side_length + voxel_offset; voxel_point.z = voxel.z * voxel_side_length + voxel_offset; pointcloud_msg->points.push_back(voxel_point); } publisher.publish(pointcloud_msg); }
You can use TemplateVoxelList::copyCoordsToHost and then create a pcl::PointXYZ for each voxel.
The color and shape of the voxels is set in the RViz PointCloud settings.
copyCoordsToHost
code:
witch demo contains this code?
The snippet I posted is not currently in any of the examples, I should fix that.
I use that snippet in so many of my applications that I just assumed it would already be in there.
The snipped I posted is not currently in any of the examples, I should fix that.
I use that snipped in so many of my applications that I just assumed it would already be in there.
if there is an example which contains that , read pcd file and turn it to volex list ,and then publish it to ros rviz, maybe that's great.
t's actually just a use of TemplateVoxelList::copyCoordsToHost and then creating a pcl::PointXYZ for each voxel, assigning the center of the point in real world coordinates based on voxel_side_length and the voxel coordinates. The next bugfix release will contain an example with the code for this. The color and shape of the voxels is set in the RViz PointCloud settings.
On copyCoordsToHost:
did we get an example for doing so?
The next bugfix release will contain an example with the code for this
Hi @AndreV84, this function shows all the important steps to publish a voxel list as a ROS pointcloud: https://github.com/fzi-forschungszentrum-informatik/gpu-voxels/issues/65#issuecomment-541816538
Do you require a more complete example?
Hi @AndreV84, this function shows all the important steps to publish a voxel list as a ROS pointcloud: #65 (comment)
Do you require a more complete example? @cjue Thank you for your response. we do not know how to apply the snippet. could you elaborate, please? does it need to be incorporated into source code of one of samples before building the gpu_voxels? In which exactly? in ros_distance? after which line? How to run the whole thing after adding it to some file then rebuilding the gpu_voxels? Thank you very much!
@cjue Thank you for your response. we do not know how to apply the snippet. could you elaborate, please? does it need to be incorporated into source code of one of samples before building the gpu_voxels? In which exactly? in ros_distance? after which line? How to run the whole thing after adding it to some file then rebuilding the gpu_voxels? Thank you very much!
Yes, the idea is to put this e.g. into the DistanceROSDemo as an alternative way of visualizing a voxellist. The following snippet shows how the publish_pointcloud function could be used in a program like DistanceROSDemo.
I will publish a full working version in two weeks after my vacation. Our full internal example has some additional dependencies that are also not yet in the open source version, and therefore require some additional work by me to get running.
ros::Publisher voxel_pointcloud_pub;
std::string voxel_pointcloud_topic("voxel_pointcloud");
/**
* Publish the voxels of a VoxelList as a pointcloud in ROS to demonstrate ROS interoperability.
*
* In RViz: add the pointcloud, set render settings to Cube, scale to voxel_side_length and alpha to 0.5
*/
template <class VoxelList>
void publish_pointcloud(shared_ptr<VoxelList>& voxel_list_ptr, ros::Publisher& publisher)
{
double voxel_offset = voxel_side_length * 0.5f;
std::vector<gpu_voxels::Vector3ui> voxels;
voxel_list_ptr->copyCoordsToHost(voxels);
PclPointCloud::Ptr pointcloud_msg(new PclPointCloud); // shared pointer captures new
pointcloud_msg->header.frame_id = voxel_point_cloud_frame; // use same reference as in transform_...
pcl_conversions::toPCL(ros::Time::now(), pointcloud_msg->header.stamp);
pointcloud_msg->height = 1;
pointcloud_msg->width = voxels.size();
for(size_t i=0; i < voxels.size(); i++)
{
Vector3ui voxel = voxels[i];
PointXYZ voxel_point;
voxel_point.x = voxel.x * voxel_side_length + voxel_offset;
voxel_point.y = voxel.y * voxel_side_length + voxel_offset;
voxel_point.z = voxel.z * voxel_side_length + voxel_offset;
pointcloud_msg->points.push_back(voxel_point);
}
publisher.publish(pointcloud_msg);
}
void doSomethingWithVoxellists() {
// do some setup for two voxellists
// CountingVoxelLists for density filtering the input cloud
gvl->addMap(MT_COUNTING_VOXELLIST, "countingVoxelList");
shared_ptr<CountingVoxelList> countingVoxelList = dynamic_pointer_cast<CountingVoxelList>(gvl->getMap("countingVoxelList"));
gvl->addMap(MT_COUNTING_VOXELLIST, "countingVoxelListFiltered");
shared_ptr<CountingVoxelList> countingVoxelListFiltered = dynamic_pointer_cast<CountingVoxelList>(gvl->getMap("countingVoxelListFiltered"));
int filter_threshold = 2;
while (ros::ok())
{
// get new data, e.g. through a ROS topic
// ...
//filter input point cloud by per voxel threshold
countingVoxelListFiltered->merge(countingVoxelList);
countingVoxelListFiltered->remove_underpopulated(filter_threshold);
gvl->visualizeMap("countingVoxelListFiltered");
// publish the filtered voxellist for RViz to show ROS interopability
// In RViz: add the pointcloud, set render settings to Cube, scale to voxel_side_length and alpha to 0.5
publish_pointcloud(countingVoxelListFiltered, voxel_pointcloud_pub);
}
}
After changing something in the example folder you need to rebuild, but it should be very quick as only the example program will be compiled, not the full library.
Thank you for the update!
@cjue > I will publish a full working version in two weeks after my vacation. looking forward for the full example I tried to undrstand which exactly lines of the file DistanceROSDemo.cpp update using the above snippet, but have had difficult times understanding exact modifications to the code Thank you very much!
Hi, I have compiled everything in TX2 without pcl following the steps:
I can then run the gvl_ompl_planner successfully, however, when I launch the gpu_voxels_visualizer, I got the following error with an empty world:
Please shine some lights if there were anything I did wrong?
Thank you in advance. Chang