facebookarchive / bistro

Bistro is a flexible distributed scheduler, a high-performance framework supporting multiple paradigms while retaining ease of configuration, management, and monitoring.
https://bistro.io
MIT License

Docker build not working (FunctionInfo.h missing) #18

Closed menzow closed 6 years ago

menzow commented 7 years ago

Hey,

I just tried compiling bistro through Docker and was not able to complete the process. Somewhere during the build the compiler throws an error that FunctionInfo.h is missing from the fbthrift dependency.

After digging through the container I was not able to recover a log file with the error message. It would be great to get some info as to where these messages get stored. Running find /home -type f -name '*.log' -exec /bin/bash -c "cat {} | grep FunctionInfo" \; did not return anything useful.

I did manage to find a probable cause for this issue by searching for references to FunctionInfo.h in the source files (/home/*) and working backwards from there. The Makefile.am in the fbthrift repository includes only the transport/core/TransportRoutingHandler.h and transport/core/ThriftProcessor.h headers. My guess is that FunctionInfo.h should also be included here.

https://github.com/facebook/fbthrift/blob/76d376e1b7d0f189708ef6438abb40862be360d5/thrift/lib/cpp2/Makefile.am#L182-L184

menzow commented 7 years ago

Also placed the issue in https://github.com/facebook/fbthrift/issues/229 since it seemed more relevant there.

To initiate the build I used the example given in this issue: https://github.com/facebook/bistro/issues/12#issuecomment-301454554

cd bistro
export os_image=ubuntu:16.04
export gcc_version=5
make_parallelism=2 ./build/fbcode_builder/travis_docker_build.sh

Thanks :)

snarkmaster commented 7 years ago

Try using this for the C++ build, for the next few days: https://github.com/facebook/bistro/commit/044cd9f5985eddc330727e59d16686499a2b0ccf

There are no substantive changes since then, and the hash I linked will build.

Apologies for the breakage, and thanks for reporting it. This week the Thrift build was being migrated from automake to CMake, and we did not manage to keep both builds working. I would expect things to go back to normal early next week.

As for getting logs, the README recommends a fancier version of this:

.../travis_docker_build.sh &> build.log & tail -f build.log

In other words, the docker build makes no attempt to record logs automatically; it's up to the user :)
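The capture described above could be wrapped in a small helper. This is only a sketch: run_logged is a hypothetical name, not something provided by Bistro or fbcode_builder, and it assumes a POSIX-style shell. It sends both stdout and stderr of a command to a log file and passes the command's exit status through.

```shell
# Hypothetical helper (not part of Bistro or fbcode_builder).
# Usage: run_logged LOGFILE COMMAND [ARGS...]
# Runs COMMAND with stdout and stderr redirected into LOGFILE and
# returns COMMAND's own exit status.
run_logged() {
  _rl_log=$1
  shift
  # 2>&1 merges stderr into stdout so compiler errors land in the log too.
  "$@" >"$_rl_log" 2>&1
  _rl_status=$?
  echo "exit status $_rl_status, log written to $_rl_log" >&2
  return "$_rl_status"
}

# Example invocation (commented out so that sourcing this file has no side effects):
# run_logged "build_at_$(date +'%Y%m%d_%H%M%S').log" \
#   ./build/fbcode_builder/travis_docker_build.sh
```

While the build runs, tail -f on the log file from a second terminal shows progress, and afterwards grep -n FunctionInfo on the same file should locate the compiler error.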

menzow commented 7 years ago

Hey @snarkmaster ,

Thanks for your reply! Checking out #044cd9f indeed resolved the missing header issues. However, the next problem is that the tests are failing.

I've tried these initialisation variables:

$ os_image=ubuntu:16.04 gcc_version=5  make_parallelism=10 travis_cache_dir=~/travis_ccache ./build/fbcode_builder/travis_docker_build.sh &> build_at_$(date +'%Y%m%d_%H%M%S').log
$ os_image=ubuntu:14.04 gcc_version=4.9  make_parallelism=10 travis_cache_dir=~/travis_ccache ./build/fbcode_builder/travis_docker_build.sh &> build_at_$(date +'%Y%m%d_%H%M%S').log

You can find a verbose output of the test here: https://gist.github.com/menzow/f60681fad7ed5ca8e66700b85730e81f

---> Running in c73fee51d1e9
Test project /home/bistro/bistro/cmake/Debug
      Start  1: test_sqlite
 1/55 Test  #1: test_sqlite ...........................   Passed    0.04 sec
      Start  2: test_remote_worker_state
 2/55 Test  #2: test_remote_worker_state ..............   Passed    0.00 sec
      Start  3: test_remote_worker
 3/55 Test  #3: test_remote_worker ....................   Passed    0.38 sec
      Start  4: test_remote_workers
 4/55 Test  #4: test_remote_workers ...................   Passed    0.04 sec
      Start  5: test_worker_set_id
 5/55 Test  #5: test_worker_set_id ....................   Passed    0.41 sec
      Start  6: test_worker
 6/55 Test  #6: test_worker ...........................***Exception: Other  0.28 sec
      Start  7: test_symbol_table
 7/55 Test  #7: test_symbol_table .....................   Passed    0.02 sec
      Start  8: test_settings_map
 8/55 Test  #8: test_settings_map .....................   Passed    0.03 sec
      Start  9: test_shuffled_range
 9/55 Test  #9: test_shuffled_range ...................   Passed    0.03 sec
      Start 10: test_log_writer
10/55 Test #10: test_log_writer .......................   Passed    0.05 sec
      Start 11: test_shell
11/55 Test #11: test_shell ............................   Passed    0.06 sec
      Start 12: test_async_read_pipe
12/55 Test #12: test_async_read_pipe ..................   Passed    0.05 sec
      Start 13: test_async_read_pipe_rate_limiter
13/55 Test #13: test_async_read_pipe_rate_limiter .....   Passed    1.07 sec
      Start 14: test_async_subprocess
14/55 Test #14: test_async_subprocess .................   Passed    1.18 sec
      Start 15: test_subprocess_output_with_timeout
15/55 Test #15: test_subprocess_output_with_timeout ...   Passed    6.12 sec
      Start 16: test_task_subprocess_queue
16/55 Test #16: test_task_subprocess_queue ............   Passed   12.40 sec
      Start 17: test_cgroup_setup
17/55 Test #17: test_cgroup_setup .....................   Passed    0.04 sec
      Start 18: test_thrift_monitor
18/55 Test #18: test_thrift_monitor ...................***Exception: Other  0.24 sec
      Start 19: test_node_getter
19/55 Test #19: test_node_getter ......................   Passed    0.06 sec
      Start 20: test_crontab_selector
20/55 Test #20: test_crontab_selector .................   Passed    0.03 sec
      Start 21: test_epoch_crontab_item
21/55 Test #21: test_epoch_crontab_item ...............   Passed    0.03 sec
      Start 22: test_standard_crontab_item
22/55 Test #22: test_standard_crontab_item ............   Passed    0.07 sec
      Start 23: test_date_time
23/55 Test #23: test_date_time ........................   Passed    0.16 sec
      Start 24: test_local_runner
24/55 Test #24: test_local_runner .....................   Passed    0.29 sec
      Start 25: test_benchmark_runner
25/55 Test #25: test_benchmark_runner .................   Passed    0.38 sec
      Start 26: test_remote_runner
26/55 Test #26: test_remote_runner ....................***Exception: Other  0.24 sec
      Start 27: test_kill_orphans
27/55 Test #27: test_kill_orphans .....................   Passed    0.10 sec
      Start 28: test_scheduler
28/55 Test #28: test_scheduler ........................***Exception: Other  0.19 sec
      Start 29: test_job_dependency
29/55 Test #29: test_job_dependency ...................   Passed    0.03 sec
      Start 30: test_long_tail
30/55 Test #30: test_long_tail ........................   Passed    0.01 sec
      Start 31: test_level_for_tasks
31/55 Test #31: test_level_for_tasks ..................   Passed    0.02 sec
      Start 32: test_round_robin
32/55 Test #32: test_round_robin ......................   Passed    0.01 sec
      Start 33: test_ranked_priority
33/55 Test #33: test_ranked_priority ..................   Passed    0.01 sec
      Start 34: test_randomized_priority
34/55 Test #34: test_randomized_priority ..............   Passed    0.04 sec
      Start 35: test_add_time_fetcher
35/55 Test #35: test_add_time_fetcher .................   Passed    0.05 sec
      Start 36: test_range_label_fetcher
36/55 Test #36: test_range_label_fetcher ..............   Passed    0.05 sec
      Start 37: test_manual_fetcher
37/55 Test #37: test_manual_fetcher ...................   Passed    0.04 sec
      Start 38: test_script_fetcher
38/55 Test #38: test_script_fetcher ...................   Passed    0.04 sec
      Start 39: test_nodes
39/55 Test #39: test_nodes ............................   Passed    0.03 sec
      Start 40: test_nodes_utils
40/55 Test #40: test_nodes_utils ......................   Passed    0.02 sec
      Start 41: test_thrift_conversion
41/55 Test #41: test_thrift_conversion ................   Passed    0.05 sec
      Start 42: test_job_filters
42/55 Test #42: test_job_filters ......................   Passed    0.07 sec
      Start 43: test_job
43/55 Test #43: test_job ..............................   Passed    0.05 sec
      Start 44: test_node
44/55 Test #44: test_node .............................   Passed    0.02 sec
      Start 45: test_config
45/55 Test #45: test_config ...........................   Passed    0.05 sec
      Start 46: test_backoff
46/55 Test #46: test_backoff ..........................   Passed    0.04 sec
      Start 47: test_file_config_loader
47/55 Test #47: test_file_config_loader ...............   Passed    0.03 sec
      Start 48: test_task_placement
48/55 Test #48: test_task_placement ...................   Passed    0.03 sec
      Start 49: test_task_status
49/55 Test #49: test_task_status ......................   Passed    0.01 sec
      Start 50: test_sqlite_task_store
50/55 Test #50: test_sqlite_task_store ................   Passed    0.03 sec
      Start 51: test_task_statuses
51/55 Test #51: test_task_statuses ....................   Passed    0.04 sec
      Start 52: test_all_tasks
52/55 Test #52: test_all_tasks ........................   Passed    0.06 sec
      Start 53: test_utils
53/55 Test #53: test_utils ............................   Passed    0.06 sec
      Start 54: test_cgroup_resources
54/55 Test #54: test_cgroup_resources .................   Passed    0.03 sec
      Start 55: test_usable_fetcher
55/55 Test #55: test_usable_fetcher ...................   Passed    0.06 sec

93% tests passed, 4 tests failed out of 55

Total Test time (real) =  25.06 sec

The following tests FAILED:
      6 - test_worker (OTHER_FAULT)
     18 - test_thrift_monitor (OTHER_FAULT)
     26 - test_remote_runner (OTHER_FAULT)
     28 - test_scheduler (OTHER_FAULT)
Errors while running CTest
The command '/bin/bash -c ctest' returned a non-zero code: 8

snarkmaster commented 6 years ago

@menzow, the test failures you're seeing are almost certainly due to flaky tests; there are no known production bugs associated with these failures. I do intend to clean these up at some point in the next 1-2 months, though. Thanks for your vigilance.
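For anyone who wants to check flakiness locally, one possible approach is to pull the failed test names out of a saved ctest log and rerun just those tests a few times. The sketch below assumes the log contains the standard "The following tests FAILED:" summary shown earlier in this thread; failed_tests is a hypothetical helper name.

```shell
# Hypothetical helper: print the names of failed tests from a saved ctest log.
# Usage: failed_tests LOGFILE
failed_tests() {
  # Summary lines look like "      6 - test_worker (OTHER_FAULT)",
  # so the test name is the third whitespace-separated field.
  awk '/The following tests FAILED:/ { in_list = 1; next }
       in_list && $2 == "-"          { print $3; next }
       in_list                        { in_list = 0 }' "$1"
}

# Example (commented out): rerun each failed test from the build directory.
# for t in $(failed_tests build.log); do
#   ctest -R "^${t}\$" --output-on-failure
# done
```

A test that passes on some reruns and fails on others is likely flaky; one that fails every time with the same output probably points to an environment or build problem instead.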

snarkmaster commented 6 years ago

I'll close this out since the issue title is not related to flaky tests, and the tests are off. Feel free to create a new issue for tests if you'd like an update when they are fixed.