bbfsdev / bbfs

Big brother file system (distributed file system)

Shiray: CPU usage is near 100% all the time and MEM consumption is high #151

Closed yarondbb closed 11 years ago

yarondbb commented 11 years ago

Shiray: This is the memory usage 5 minutes after starting. The configuration is set to monitor 3 directories with a total of only 36555 files.

CPU usage is near 100% all the time. Here 100% means full usage of a single CPU core.

yarondbb commented 11 years ago

A few candidates for this behavior:

  1. Monitoring and indexing are running non-stop.
  2. For each monitoring event the entire content data structure is cloned twice (when writing to the index output queue and when updating the dynamic content data). Consider updating only when there is a change, and cloning it once, or think of incremental changes (since this is all done locally).
  3. Networking is exhausting the system when the protocol runs over xxxK files. Consider requesting only xxx file copies at any compare event?

More candidates?
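Candidate 2 can be sketched roughly as follows. This is only an illustration of "update only on change, clone once": `ContentSnapshot`, `update`, and `shared_copy` are hypothetical names, not bbfs classes, and a plain path-to-checksum hash stands in for the real content data.

```ruby
# Hypothetical stand-in for the server's content data; not the real bbfs class.
class ContentSnapshot
  attr_reader :clone_count

  def initialize
    @files = {}         # path => checksum
    @dirty = false      # true when @files changed since the last clone
    @cached_copy = nil  # frozen copy handed out to every consumer
    @clone_count = 0
  end

  # Record a monitoring event. Returns true (and marks the data dirty)
  # only when the checksum actually differs from the stored one.
  def update(path, checksum)
    return false if @files[path] == checksum

    @files[path] = checksum
    @dirty = true
    true
  end

  # Return one frozen copy shared by all consumers; re-clone only after a change.
  def shared_copy
    if @dirty || @cached_copy.nil?
      @cached_copy = @files.dup.freeze
      @clone_count += 1
      @dirty = false
    end
    @cached_copy
  end
end

snapshot = ContentSnapshot.new
snapshot.update('/tmp/a.txt', 'abc')
for_index_queue = snapshot.shared_copy
for_dynamic_data = snapshot.shared_copy  # same object, no second clone
snapshot.update('/tmp/a.txt', 'abc')     # unchanged event, nothing marked dirty
snapshot.shared_copy
puts snapshot.clone_count  # prints 1
```

Because the shared copy is frozen, both consumers can read it safely without each needing a private clone.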

bbfsdev commented 11 years ago

Note that Shiray tested this on the backup server with the following configuration (note the 5-minute scan period). We have to recreate the problem first.

# Paths to monitor/backup files from. 
# All path files and recursive sub paths are used.
monitoring_paths:
  - path: /mnt/storage/media/files
    scan_period: 300
    stable_state: 2
  - path: /mnt/storage/media/audio
    scan_period: 300
    stable_state: 2
  - path: /mnt/storage/media/video
    scan_period: 300
    stable_state: 2

# Backup folder
backup_destination_folder: '/mnt/storage/bbfs/backup'

# Default is nil
content_server_hostname: 'localhost'

# TCP/IP port to start server to send files to backup
content_server_files_port: 4444

# TCP/IP port to start server to send content data upon request from client
content_server_data_port: 3333

# Delay before trying to reconnect
client_retry_delay: 300

# backup data file, i.e. state(index) of backup server destination folder
local_content_data_path: '~/.bbfs/var/backup.data'

# Write all log messages to file
log_write_to_file: true
log_file_name: '~/.bbfs/log/backup_server.log4r'

# Write all log messages to standard output (default: false)
log_write_to_console: false

# Send errors to mail on system crash only (default: false)
log_write_to_email: false

# Verbosity of logging
# 0 will log INFO messages. 3 will print ALL debug messages as well.
log_debug_level: 0

# user should change to real value
from_email: 'jhon.doe@gmail.com'

# user should change to real value
from_email_password: 'hihahu'

# user should change to real value
to_email: 'jhon.doe@gmail.com'

# Log of file changes
default_monitoring_log_path: '~/.bbfs/log/backup_file_monitoring.log'

# Cycles of fetch period used by backup server to ping content server for its content
remote_content_save_timeout: 900

# Cycles where backup is checking if sync is required between remote and backup contents
backup_check_delay: 300

# Write all parameters to console on start
print_params_to_stdout: false
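
When recreating the problem, it may help to confirm which scan periods are actually in effect. The following is a minimal sketch that reads an excerpt of the configuration above with Ruby's standard YAML library; `CONFIG_TEXT` and `scan_periods` are hypothetical names used for illustration, not part of bbfs.

```ruby
require 'yaml'

# Excerpt of the configuration above, inlined so the sketch is self-contained.
CONFIG_TEXT = <<~YAML
  monitoring_paths:
    - path: /mnt/storage/media/files
      scan_period: 300
      stable_state: 2
    - path: /mnt/storage/media/audio
      scan_period: 300
      stable_state: 2
  backup_check_delay: 300
YAML

# Hypothetical helper: pair each monitored path with its scan period in minutes.
def scan_periods(config)
  config.fetch('monitoring_paths', []).map do |entry|
    [entry['path'], entry['scan_period'] / 60.0]
  end
end

config = YAML.safe_load(CONFIG_TEXT)
scan_periods(config).each do |path, minutes|
  puts "#{path}: scanned every #{minutes} min"
end
```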

bbfsdev commented 11 years ago

https://github.com/ruby-prof/ruby-prof

bbfsdev commented 11 years ago

http://ruby-prof.rubyforge.org/

bbfsdev commented 11 years ago

Don't forget to start a shared tmux session so everyone can see your work.

bbfsdev commented 11 years ago

The critical part of this is the MEM consumption. CPU consumption is not critical.

bbfsdev commented 11 years ago

@yarondbb Please concentrate on MEM consumption and on the profiling tool (ruby-prof). But first try to recreate the MEM problem with v1.0.2.
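
Before reaching for a full profiler, a rough first check of memory growth can be done with the interpreter's built-in counters. This sketch does not use ruby-prof. `memory_sample` and `memory_delta` are hypothetical helper names, and the block at the bottom is only a stand-in for one monitoring or indexing cycle.

```ruby
# Hypothetical helper: snapshot of live heap slots and per-type object counts.
def memory_sample
  GC.start # collect garbage first so the numbers reflect live objects
  counts = ObjectSpace.count_objects
  {
    live_slots: GC.stat(:heap_live_slots),
    strings: counts[:T_STRING],
    hashes: counts[:T_HASH]
  }
end

# Run a block and report how each counter changed while its result is retained.
def memory_delta
  before = memory_sample
  result = yield
  after = memory_sample
  delta = after.map { |key, value| [key, value - before[key]] }.to_h
  [result, delta]
end

retained, delta = memory_delta do
  Array.new(50_000) { |i| "file_#{i}" } # stand-in for a retained index
end
puts "retained #{retained.size} strings"
delta.each { |name, change| puts "#{name}: #{change}" }
```

If a counter keeps growing from one scan cycle to the next, that is the place to point the profiler at.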

yarondbb commented 11 years ago

Analysis report:

  1. The backup server was run with the above configuration for 5 minutes and more (36k files with an average size of 2K). No memory inflation seen.
  2. The backup server was run with 80 files with an average size of 250M. No memory inflation seen.

Maybe the file distribution is different on Shiray's system. Shiray, please show us the results of: du /mnt/storage/media
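
To compare file distributions between systems, a size histogram may be easier to read than raw `du` output. This is a minimal sketch using Ruby's standard `find` library; `BUCKETS` and `size_histogram` are hypothetical names and the bucket boundaries are arbitrary.

```ruby
require 'find'

# Upper limits (in bytes) for each size bucket; the boundaries are arbitrary.
BUCKETS = [
  ['< 1K', 1024],
  ['1K-1M', 1024**2],
  ['1M-1G', 1024**3]
].freeze

# Hypothetical helper: count regular files under root per size bucket.
def size_histogram(root)
  histogram = Hash.new(0)
  Find.find(root) do |path|
    next unless File.file?(path) # skip directories and broken symlinks

    size = File.size(path)
    label, = BUCKETS.find { |_, limit| size < limit }
    histogram[label || '>= 1G'] += 1
  end
  histogram
end

# Usage: ruby size_histogram.rb /mnt/storage/media
if ARGV[0]
  size_histogram(ARGV[0]).each { |label, count| puts "#{label}: #{count}" }
end
```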

yarondbb commented 11 years ago

@kolmanv I can connect to Sinatra via localhost only (on my Windows machine), but I cannot connect to the Linux server. Any idea why? Did it work before?

yarondbb commented 11 years ago

OK, the problem is fixed (by adding: set :bind, '0.0.0.0', which should be the default so that any incoming connection is accepted).
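
In Sinatra, `set :bind` selects the interface the server listens on. The difference between a loopback-only bind and `0.0.0.0` can be illustrated with plain standard-library sockets. This is a sketch of the underlying behavior rather than Sinatra code, and `bound_address` is a hypothetical helper.

```ruby
require 'socket'

# Hypothetical helper: open a listening socket on the given interface and
# return the numeric address it is bound to. Port 0 lets the OS pick a free port.
def bound_address(host)
  server = TCPServer.new(host, 0)
  server.addr[3]
ensure
  server&.close
end

puts bound_address('127.0.0.1') # loopback only: reachable from the same machine
puts bound_address('0.0.0.0')   # all IPv4 interfaces: reachable from other hosts
```

A server bound to the loopback address never sees connections from other machines, which matches the symptom of being reachable via localhost only.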

yarondbb commented 11 years ago

Deployed at patch_1_0_3. Blocked by inputs from @vshiray. Moving assignee to @kolmanv.

bbfsdev commented 11 years ago

VERY GOOD WORK GUYS :) :+1: @vshiray @yarondbb @AlexeyNemytov