jsk-ros-pkg / jsk_robot

https://github.com/jsk-ros-pkg/jsk_robot

Pepper stores compressed image on mongo #1714

Open knorth55 opened 1 year ago

knorth55 commented 1 year ago

@k-okada Pepper stores compressed images in MongoDB on musca, which will soon make queries slow. This is why I gave up using Mongo in the past. If you want to enable compressed images for all robots, we need a better computer with better storage. We also need to run the Mongo backup more frequently, which I have been doing manually. What is your plan?

knorth55 commented 1 year ago

I think for Mongo it is better to store data periodically, for example during smach execution, because Mongo is slow and heavy. The smach status local_data now contains image data, so we don't need to store images all the time if you want to use the smach execution images.

Honestly, the Mongo server will become slow and heavy soon, and I don't want to maintain the musca Mongo server that often.

cc. @tkmtnt7000

k-okada commented 1 year ago

> We also need to run the Mongo backup more frequently, which I have been doing manually.

Oh, I didn't know that. Please explain how to back up the Mongo data; I am happy to maintain that, because this is the essence of our future research direction and I'd also like to show how to use the stored data in the next demonstration (assigned: @tkmtnt7000, @mqcmd196). You can also remove the current pepper/basil data if it is already causing problems. It is just a test.

> I think for Mongo it is better to store data periodically

Periodically? Or on an as-needed basis? https://github.com/jsk-ros-pkg/jsk_recognition/pull/2736 stores data only when the data has changed.
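
As an illustration of the "store only when the data has changed" idea, here is a minimal sketch. It is not taken from the linked PR, whose actual mechanism is not described in this thread; the `ChangeGate` class and its exact-match hashing are assumptions made for the example. Real camera frames rarely repeat byte for byte, so a practical gate would probably need a tolerance or a semantic comparison instead.

```python
import hashlib


class ChangeGate:
    """Hypothetical helper: let a payload through only if it differs from the previous one."""

    def __init__(self):
        self._last_digest = None

    def should_store(self, payload: bytes) -> bool:
        digest = hashlib.sha1(payload).hexdigest()
        if digest == self._last_digest:
            return False  # identical to the previous payload, skip it
        self._last_digest = digest
        return True


gate = ChangeGate()
frames = [b"frame-a", b"frame-a", b"frame-b", b"frame-a"]
# Only frames that differ from their predecessor are kept.
stored = [f for f in frames if gate.should_store(f)]
print(stored)
```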

> The smach status local_data now contains image data, so we don't need to store images all the time if you want to use the smach execution images.

I'd like to build a memory for the robot while @tkmtnt7000 and @a-ichikura talk to each other and show some objects, based on their conversation, in front of the robot. Maybe we can store a good image sequence if we build a smach whose state changes in a very fast loop, but I think keeping images outside of smach states also makes sense.

> Honestly, the Mongo server will become slow and heavy soon, and I don't want to maintain the musca Mongo server that often.

I agree, which is why I am waiting for https://github.com/jsk-ros-pkg/jsk_3rdparty/issues/372 and https://github.com/jsk-ros-pkg/jsk_robot/pull/1574.

But I understand that moving to a new system is not as easy as we thought at the beginning, and I always believe "worse is always better than worst".

I think one reason the DB is heavy is that we store all data in one collection. If this is true, one idea is to separate collections by episode.

knorth55 commented 1 year ago

> Oh, I didn't know that. Please explain how to back up the Mongo data; I am happy to maintain that

We have a QNAP (called yokan) as backup for the musca MongoDB, and I set up a cron job that runs rsync to the QNAP. https://github.com/knorth55/jsk_database/blob/main/jsk_database_scripts/mongodb/backup_to_qnap.sh

$ sudo crontab -l
0 5 * * SUN /bin/bash /home/furushchev/Development/jsk_database/jsk_database_scripts/mongodb/backup_to_qnap.sh >> /var/log/backup_to_qnap.log
0 4 * * SUN /sbin/shutdown -r now

And when the hard disk of musca is full, I stop MongoDB, run rsync to back it up to the QNAP, and remove MongoDB's directory manually. This is what I meant by maintenance. I do the same thing for InfluxDB, too. (InfluxDB is running on my computer.)
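
The manual step above is triggered by the disk filling up. A small check like the following could flag that condition earlier. This is only a sketch: the 0.9 threshold and the function names are hypothetical, and the path passed in would have to be wherever MongoDB actually keeps its data on musca, which the thread does not state.

```python
import shutil


def usage_ratio(path: str) -> float:
    """Return the used fraction (0.0 to 1.0) of the filesystem containing `path`."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def needs_maintenance(ratio: float, threshold: float = 0.9) -> bool:
    """Hypothetical rule: flag the disk once usage reaches the threshold."""
    return ratio >= threshold


# "." is a placeholder; point this at the MongoDB data directory in practice.
ratio = usage_ratio(".")
print(f"disk usage: {ratio:.1%}, maintenance needed: {needs_maintenance(ratio)}")
```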

> Periodically? Or on an as-needed basis? https://github.com/jsk-ros-pkg/jsk_recognition/pull/2736 stores data only when the data has changed.

I meant on an as-needed basis. But well, let's check whether https://github.com/jsk-ros-pkg/jsk_recognition/pull/2736 works.

> I'd like to build a memory for the robot while @tkmtnt7000 and @a-ichikura talk to each other and show some objects, based on their conversation, in front of the robot. Maybe we can store a good image sequence if we build a smach whose state changes in a very fast loop, but I think keeping images outside of smach states also makes sense.

Hmmm, I want to store the image data in a directory and save only the path... If the directory is located on Google Drive, that is even better. I think I need to implement a Google Drive recording system for MongoDB, too.
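
The "store the image in a directory and save the path" idea could look roughly like this. It is a sketch under assumptions: the function name, the file naming scheme and the record fields are all hypothetical, and the returned dictionary stands in for whatever document would actually be inserted into MongoDB. The `image_format` argument mirrors the `format` string of `sensor_msgs/CompressedImage`.

```python
import hashlib
import os
import tempfile


def save_image_and_make_record(image_bytes, image_format, topic, stamp, root_dir):
    """Write the image bytes to disk and return a small record that holds only the path."""
    fmt = image_format.lower()
    ext = "jpg" if "jpeg" in fmt else "png" if "png" in fmt else "bin"
    digest = hashlib.sha1(image_bytes).hexdigest()
    # Hypothetical naming scheme: timestamp plus a short content hash.
    path = os.path.join(root_dir, f"{stamp:.6f}_{digest[:8]}.{ext}")
    with open(path, "wb") as f:
        f.write(image_bytes)
    # The record carries metadata and the path, never the image payload itself.
    return {
        "topic": topic,
        "stamp": stamp,
        "format": image_format,
        "path": path,
        "size": len(image_bytes),
    }


with tempfile.TemporaryDirectory() as root:
    record = save_image_and_make_record(
        b"\xff\xd8not-a-real-jpeg", "jpeg", "/camera/image/compressed", 1.5, root
    )
    print(record)
```

With this layout the database documents stay small, and the image directory can be moved or pruned independently of the database.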

> I agree, which is why I am waiting for https://github.com/jsk-ros-pkg/jsk_3rdparty/issues/372 and https://github.com/jsk-ros-pkg/jsk_robot/pull/1574.

OK, I will implement the smach logger soon.

> I think one reason the DB is heavy is that we store all data in one collection. If this is true, one idea is to separate collections by episode.

That's true. Right now we put all the data into the same jsk_robot_lifelog collection, which causes slow queries. Collection names should be separated by robot name. I also want to split the image data from the other data, so that we can easily drop the image data when the database gets heavy.
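
A minimal sketch of the split described above, routing each message to a collection chosen by robot name and by whether it is image data. The `<robot>_images` / `<robot>_data` naming and the set of image types are assumptions for illustration; the thread does not fix a naming scheme.

```python
# Message types treated as image data in this sketch (an assumption, extend as needed).
IMAGE_TYPES = {"sensor_msgs/Image", "sensor_msgs/CompressedImage"}


def collection_for(robot_name: str, msg_type: str) -> str:
    """Pick a per-robot collection, keeping image messages apart from everything else."""
    suffix = "images" if msg_type in IMAGE_TYPES else "data"
    return f"{robot_name}_{suffix}"


for robot, msg_type in [
    ("pepper", "sensor_msgs/CompressedImage"),
    ("pepper", "std_msgs/String"),
    ("fetch", "sensor_msgs/CompressedImage"),
]:
    print(robot, msg_type, "->", collection_for(robot, msg_type))
```

With images isolated this way, freeing space means dropping one collection per robot rather than running a filtered delete over a shared one.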

knorth55 commented 1 year ago

For compressed depth, I made zdepth_image_transport. https://github.com/jsk-ros-pkg/jsk_3rdparty/pull/389

The big issue with compressed depth is that PNG compression is too slow and image_transport runs serially, so when we subscribe to compressedDepth, all the depth topics become slow. I implemented zdepth_image_transport, and with it depth no longer seems to slow down :) Related issues: https://github.com/IntelRealSense/realsense-ros/issues/1672 https://github.com/IntelRealSense/realsense-ros/issues/369

knorth55 commented 1 year ago

I discussed this with @tkmtnt7000, and we will first try to store the Fetch Kitchen demo in a different collection. To do that, we first launch common_logger.launch (for a different collection) in the Fetch Kitchen demo. We also need to avoid namespace collisions by changing the namespace.

@tkmtnt7000 is now modifying common_logger.launch to support a different collection name and to avoid name clashes with the default common_logger.launch.

knorth55 commented 1 year ago

We need to change

We also implement

cc. @tkmtnt7000

k-okada commented 1 year ago

@knorth55 @tkmtnt7000 I am trying to add the lifelog setting to the Spot robot (https://github.com/jsk-ros-pkg/jsk_robot/pull/1701#issuecomment-1308231308). What do we need to add/implement for the topics to be stored?

Isn't it enough to just use mongo_record.py?

    <node if="$(arg speech_to_text)"
          name="app_logger"
          pkg="jsk_robot_startup" type="mongo_record.py"
          machine="$(arg machine)"
          respawn="$(arg respawn)">
      <rosparam subst_value="true">
        subst_param: true
        topics:
        - /Tablet/voice
      </rosparam>
    </node>

knorth55 commented 1 year ago

yes you only need to add several nodes like that!

k-okada commented 1 year ago

speech_logger: -> done??

https://github.com/jsk-ros-pkg/jsk_robot/blob/f74dd986bcdb29f8401ca9313cb15964f0244bb2/jsk_spot_robot/jsk_spot_startup/launch/include/lifelog.launch#L85-L88

c.f. https://github.com/jsk-ros-pkg/jsk_robot/pull/1701

knorth55 commented 1 year ago

> speech_logger: -> done??

https://github.com/jsk-ros-pkg/jsk_robot/pull/1701 is a logger for what the robot speaks. I meant a logger for what people speak. We just need to record /speech_to_text/output, which is published like this.

https://github.com/jsk-ros-pkg/jsk_robot/blob/ebf43f0c7a5f2df43bf81be6290b08cdbd7594d9/jsk_fetch_robot/jsk_fetch_startup/launch/fetch_bringup.launch#L143-L181

tkmtnt7000 commented 1 year ago

I have found that it is difficult to extract compressed image data from the go_to_kitchen collection, which holds about 30 GB of data.

db.go_to_kitchen.find({"_meta.stored_type": "sensor_msgs/CompressedImage"}).sort({"_meta.timestamp": -1}).limit(1)

Probably we should use video_to_scene.launch or something instead.
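
One thing that may be worth trying before switching tools is a compound index that matches the query above, so that the filter and the sort can both be served from the index. The thread does not say whether such an index already exists or whether its absence is the actual bottleneck, so treat this as an assumption. The sketch takes a PyMongo `Collection` as an argument and uses only `create_index` and `find_one`; the helper names are hypothetical.

```python
# Filter and sort fields are copied from the Mongo shell query above.
FILTER = {"_meta.stored_type": "sensor_msgs/CompressedImage"}
INDEX_KEYS = [("_meta.stored_type", 1), ("_meta.timestamp", -1)]


def ensure_image_index(collection):
    """Create (or reuse) a compound index covering the filter and the sort order."""
    return collection.create_index(INDEX_KEYS)


def latest_compressed_image(collection):
    """Return the most recent CompressedImage document, or None if there is none."""
    return collection.find_one(FILTER, sort=[("_meta.timestamp", -1)])


# Usage with PyMongo (host and database name are placeholders, not from the thread):
#   from pymongo import MongoClient
#   coll = MongoClient("mongodb://<host>:27017")["<database>"]["go_to_kitchen"]
#   ensure_image_index(coll)
#   doc = latest_compressed_image(coll)
```

Building the index on a collection of this size takes time and disk space of its own, so it is a trade-off rather than a free fix.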