DARMA-tasking / vt

DARMA/vt => Virtual Transport

#2174: Integrate vt-tv api into lb data export #2294

Closed pierrepebay closed 2 months ago

pierrepebay commented 6 months ago

Fixes #2174

github-actions[bot] commented 6 months ago

Pipelines results

PR tests (gcc-12, ubuntu, mpich, verbose)

Build for ee315965cbeb830a9186f26e6d8490e01918a7fe (2024-05-28 17:09:22 UTC)

FAILED: src/CMakeFiles/vt.dir/Unity/unity_20_cxx.cxx.o 
/usr/bin/ccache /usr/lib/ccache/g++ -DJSON_USE_IMPLICIT_CONVERSIONS=1 -DVT_NO_COLOR_ENABLED -I/vt/lib/CLI -I/build/vt/release -I/vt/src -I/vt/lib/json/include -I/vt/lib/brotli/c/include -I/vt/lib/libfort/lib -isystem /vt/lib/fmt/include -isystem /vt/lib/EngFormat-Cpp/include -isystem /build/checkpoint/install/include -O3 -DNDEBUG -Wall -pedantic -Wshadow -Wno-unknown-pragmas -Wsign-compare -ftemplate-backtrace-limit=100 -Werror -std=c++17 -MD -MT src/CMakeFiles/vt.dir/Unity/unity_20_cxx.cxx.o -MF src/CMakeFiles/vt.dir/Unity/unity_20_cxx.cxx.o.d -o src/CMakeFiles/vt.dir/Unity/unity_20_cxx.cxx.o -c /build/vt/src/CMakeFiles/vt.dir/Unity/unity_20_cxx.cxx
In file included from /vt/src/vt/vrt/collection/balance/node_lb_data.h:57,
                 from /vt/src/vt/vrt/collection/manager.impl.h:69,
                 from /vt/src/vt/vrt/collection/manager.h:1773,
                 from /vt/src/vt/scheduler/scheduler.cc:49,
                 from /build/vt/src/CMakeFiles/vt.dir/Unity/unity_20_cxx.cxx:7:
/vt/src/vt/vrt/collection/balance/lb_data_holder.h:52:12: fatal error: vt-tv/api/info.h: No such file or directory
   52 | #  include <vt-tv/api/info.h>
      |            ^~~~~~~~~~~~~~~~~~
compilation terminated.

Build log


PR tests (gcc-12, ubuntu, mpich, verbose, kokkos)

Build for 25361b69d720592cc0609e189de52e58b3026785 (2024-09-17 19:30:11 UTC)

Compilation - successful

Testing - passed

Build log


PR tests (gcc-10, ubuntu, openmpi, no LB)

Build for 25361b69d720592cc0609e189de52e58b3026785 (2024-09-17 19:30:11 UTC)

Compilation - successful

Testing - passed

Build log


PR tests (clang-13, alpine, mpich)

Build for 25361b69d720592cc0609e189de52e58b3026785 (2024-09-17 19:30:11 UTC)

Compilation - successful

Testing - passed

Build log


PR tests (clang-9, ubuntu, mpich)

Build for 25361b69d720592cc0609e189de52e58b3026785 (2024-09-17 19:30:11 UTC)

Compilation - successful

Testing - passed

Build log


PR tests (clang-13, ubuntu, mpich)

Build for 25361b69d720592cc0609e189de52e58b3026785 (2024-09-17 19:30:11 UTC)

Compilation - successful

Testing - passed

Build log


PR tests (clang-11, ubuntu, mpich)

Build for 25361b69d720592cc0609e189de52e58b3026785 (2024-09-17 19:30:11 UTC)

Compilation - successful

Testing - passed

Build log


PR tests (gcc-8, ubuntu, mpich, address sanitizer)

Build for 25361b69d720592cc0609e189de52e58b3026785 (2024-09-17 19:30:11 UTC)

Compilation - successful

Testing - passed

Build log


PR tests (clang-12, ubuntu, mpich)

Build for 25361b69d720592cc0609e189de52e58b3026785 (2024-09-17 19:30:11 UTC)

Compilation - successful

Testing - passed

Build log


PR tests (gcc-9, ubuntu, mpich, zoltan, json schema test)

Build for 25361b69d720592cc0609e189de52e58b3026785 (2024-09-17 19:30:11 UTC)

Compilation - successful

Testing - passed

Build log


PR tests (clang-14, ubuntu, mpich, verbose)

Build for 25361b69d720592cc0609e189de52e58b3026785 (2024-09-17 19:30:11 UTC)

Compilation - successful

Testing - passed

Build log


PR tests (intel icpx, ubuntu, mpich, verbose)

Build for 25361b69d720592cc0609e189de52e58b3026785 (2024-09-17 19:30:11 UTC)

Compilation - successful

Testing - passed

Build log


PR tests (clang-10, ubuntu, mpich)

Build for 25361b69d720592cc0609e189de52e58b3026785 (2024-09-17 19:30:11 UTC)

Compilation - successful

Testing - passed

Build log


PR tests (nvidia cuda 11.2, gcc-9, ubuntu, mpich)

Build for 25361b69d720592cc0609e189de52e58b3026785 (2024-09-17 19:30:11 UTC)

/vt/src/vt/pipe/pipe_manager.impl.h(135): warning: missing return statement at end of non-void function "vt::pipe::PipeManager::makeSend<f,Target>(Target) [with f=&vt::vrt::collection::lb::GreedyLB::collectHandler, Target=vt::objgroup::proxy::ProxyElm<vt::vrt::collection::lb::GreedyLB>]"
          detected during:
            instantiation of "auto vt::pipe::PipeManager::makeSend<f,Target>(Target) [with f=&vt::vrt::collection::lb::GreedyLB::collectHandler, Target=vt::objgroup::proxy::ProxyElm<vt::vrt::collection::lb::GreedyLB>]" 
/vt/src/vt/objgroup/proxy/proxy_objgroup.impl.h(221): here
            instantiation of "vt::objgroup::proxy::Proxy<ObjT>::PendingSendType vt::objgroup::proxy::Proxy<ObjT>::reduce<f,Op,Target,Args...>(Target, Args &&...) const [with ObjT=vt::vrt::collection::lb::GreedyLB, f=&vt::vrt::collection::lb::GreedyLB::collectHandler, Op=vt::collective::PlusOp, Target=vt::objgroup::proxy::ProxyElm<vt::vrt::collection::lb::GreedyLB>, Args=<vt::vrt::collection::lb::GreedyPayload>]" 
/vt/src/vt/vrt/collection/balance/greedylb/greedylb.cc(222): here

/vt/src/vt/pipe/pipe_manager.impl.h(135): warning: missing return statement at end of non-void function "vt::pipe::PipeManager::makeSend<f,Target>(Target) [with f=&MyObj::handler, Target=vt::objgroup::proxy::ProxyElm<MyObj>]"
          detected during instantiation of "auto vt::pipe::PipeManager::makeSend<f,Target>(Target) [with f=&MyObj::handler, Target=vt::objgroup::proxy::ProxyElm<MyObj>]" 
/vt/examples/callback/callback.cc(147): here

/vt/src/vt/pipe/pipe_manager.impl.h(135): warning: missing return statement at end of non-void function "vt::pipe::PipeManager::makeSend<f,Target>(Target) [with f=&colHan, Target=vt::vrt::collection::VrtElmProxy<MyCol, vt::Index1D>]"
          detected during instantiation of "auto vt::pipe::PipeManager::makeSend<f,Target>(Target) [with f=&colHan, Target=vt::vrt::collection::VrtElmProxy<MyCol, vt::Index1D>]" 
/vt/examples/callback/callback.cc(153): here

/vt/src/vt/pipe/pipe_manager.impl.h(135): warning: missing return statement at end of non-void function "vt::pipe::PipeManager::makeSend<f,Target>(Target) [with f=&MyObj::handler, Target=vt::objgroup::proxy::ProxyElm<MyObj>]"
          detected during instantiation of "auto vt::pipe::PipeManager::makeSend<f,Target>(Target) [with f=&MyObj::handler, Target=vt::objgroup::proxy::ProxyElm<MyObj>]" 
/vt/examples/callback/callback.cc(147): here

/vt/src/vt/pipe/pipe_manager.impl.h(135): warning: missing return statement at end of non-void function "vt::pipe::PipeManager::makeSend<f,Target>(Target) [with f=&colHan, Target=vt::vrt::collection::VrtElmProxy<MyCol, vt::Index1D>]"
          detected during instantiation of "auto vt::pipe::PipeManager::makeSend<f,Target>(Target) [with f=&colHan, Target=vt::vrt::collection::VrtElmProxy<MyCol, vt::Index1D>]" 
/vt/examples/callback/callback.cc(153

==> And there is more. Read log. <==

Build log


PR tests (gcc-11, ubuntu, mpich, trace runtime, coverage)

Build for 25361b69d720592cc0609e189de52e58b3026785 (2024-09-17 19:30:11 UTC)

Compilation - successful

Testing - passed

Build log


PR tests (nvidia cuda 12.2.0, gcc-9, ubuntu, mpich, verbose)

Build for 25361b69d720592cc0609e189de52e58b3026785 (2024-09-17 19:30:11 UTC)

/vt/lib/CLI/CLI/CLI11.hpp(1029): warning #2361-D: invalid narrowing conversion from "double" to "unsigned long"
          TT { std::declval<CC>() }
               ^
          detected during:
            instantiation of "vt::CLI::detail::is_direct_constructible<T, C>::test [with T=std::vector<std::string, std::allocator<std::string>>, C=double]" based on template arguments <std::vector<std::string, std::allocator<std::string>>, double> at line 1041
            instantiation of class "vt::CLI::detail::is_direct_constructible<T, C> [with T=std::vector<std::string, std::allocator<std::string>>, C=double]" at line 5005
            instantiation of "void vt::CLI::Option::results(T &) const [with T=std::vector<std::string, std::allocator<std::string>>]" at line 5034
            instantiation of "T vt::CLI::Option::as<T>() const [with T=std::vector<std::string, std::allocator<std::string>>]" at line 7315

Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"

/vt/lib/CLI/CLI/CLI11.hpp(1029): warning #2361-D: invalid narrowing conversion from "int" to "unsigned long"
          TT { std::declval<CC>() }
               ^
          detected during:
            instantiation of "vt::CLI::detail::is_direct_constructible<T, C>::test [with T=std::vector<std::string, std::allocator<std::string>>, C=int]" based on template arguments <std::vector<std::string, std::allocator<std::string>>, int> at line 1041
            instantiation of class "vt::CLI::detail::is_direct_constructible<T, C> [with T=std::vector<std::string, std::allocator<std::string>>, C=int]" at line 5005
            instantiation of "void vt::CLI::Option::results(T &) const [with T=std::vector<std::string, std::allocator<std::string>>]" at line 5034
            instantiation of "T vt::CLI::Option::as<T>() const [with T=std::vector<std::string, std::allocator<std::string>>]" at line 7315

/vt/tests/perf/send_cost.cc(169): warning #177-D: variable "prevNode" was declared but never referenced
    auto const prevNode = (thisNode - 1 + num_nodes_) % num_nodes_;
               ^

Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"

Testing - passed

Build log


PR tests (intel icpc, ubuntu, mpich)

Build for 25361b69d720592cc0609e189de52e58b3026785 (2024-09-17 19:30:11 UTC)

remark #11074: Inlining inhibited by limit max-size 
remark #11074: Inlining inhibited by limit max-total-size 
remark #11076: To get full report use -qopt-report=4 -qopt-report-phase ipo
remark #11074: Inlining inhibited by limit max-size 
remark #11074: Inlining inhibited by limit max-total-size 
remark #11076: To get full report use -qopt-report=4 -qopt-report-phase ipo
remark #11074: Inlining inhibited by limit max-size 
remark #11074: Inlining inhibited by limit max-total-size 
remark #11076: To get full report use -qopt-report=4 -qopt-report-phase ipo
remark #11074: Inlining inhibited by limit max-size 
remark #11074: Inlining inhibited by limit max-total-size 
remark #11076: To get full report use -qopt-report=4 -qopt-report-phase ipo
remark #11074: Inlining inhibited by limit max-size 
remark #11074: Inlining inhibited by limit max-total-size 
remark #11076: To get full report use -qopt-report=4 -qopt-report-phase ipo
remark #11074: Inlining inhibited by limit max-size 
remark #11074: Inlining inhibited by limit max-total-size 
remark #11076: To get full report use -qopt-report=4 -qopt-report-phase ipo
remark #11074: Inlining inhibited by limit max-total-size 
remark #11076: To get full report use -qopt-report=4 -qopt-report-phase ipo
remark #11074: Inlining inhibited by limit max-size 
remark #11074: Inlining inhibited by limit max-total-size 
remark #11076: To get full report use -qopt-report=4 -qopt-report-phase ipo
remark #11074: Inlining inhibited by limit max-size 
remark #11074: Inlining inhibited by limit max-total-size 
remark #11076: To get full report use -qopt-report=4 -qopt-report-phase ipo
remark #11074: Inlining inhibited by limit max-size 
remark #11074: Inlining inhibited by limit max-total-size 
remark #11076: To get full report use -qopt-report=4 -qopt-report-phase ipo
remark #11074: Inlining inhibited by limit max-total-size 
remark #11076: To get full report use -qopt-report=4 -qopt-report-phase ipo
remark #11074: Inlining inhibited by limit max-size 
remark #11074: Inlining inhibited by limit max-total-size 
remark #11076: To get full report use -qopt-report=4 -qopt-report-phase ipo
remark #11074: Inlining inhibited by limit max-size 
remark #11074: Inlining inhibited by limit max-total-size 
remark #11076: To get full report use -qopt-report=4 -qopt-report-phase ipo
remark #11074: Inlining inhibited by limit max-size 
remark #11074: Inlining inhibited by limit max-total-size 
remark #11076: To get full report use -qopt-report=4 -qopt-report-phase ipo
remark #11074: Inlining inhibited by limit max-total-size 
remark #11076: To get full report use -qopt-report=4 -qopt-report-phase ipo
remark #11074: Inlining inhibited by limit max-size 
remark #11074: Inlining inhi

==> And there is more. Read log. <==

Build log


cwschilly commented 4 months ago

If VT_TV_ENABLED is ON, all tests are run with --vt_tv and the default config file, test_vttv.yaml. This does not generate any PNGs or meshes; it only tests that the data is passed from vt to vt-tv without errors.

cwschilly commented 4 months ago

The vt-tv pipeline shows this error when trying to access the cached docker image:

#6 [ubuntu-cpp-vtk] importing cache manifest from ***/vt:amd64-ubuntu-22.04-gcc-12-gcc-12-vtk-cpp
#6 ERROR: failed to configure registry cache importer: docker.io/***/vt:amd64-ubuntu-22.04-gcc-12-gcc-12-vtk-cpp: not found

I think this will be resolved once we push to develop and the pushdockerimage.yml workflow runs. Is that correct?

pierrepebay commented 3 months ago

@cwschilly we should add lib/vt-tv to the .gitignore in this PR

cwschilly commented 3 months ago

@nlslatt @lifflander @JacobDomagala I want to point out this comment before we merge--will this problem be resolved once we push to develop, or is there a fix that I can do on this PR?

cwschilly commented 3 months ago

Rerunning CI after merging vttv#85

cwschilly commented 3 months ago

CI fails on the gcc-9, ubuntu, mpich, zoltan pipeline due to setting VT_CI_TEST_LB_SCHEMA=1:

##[error]Traceback (most recent call last):
##[error]  File "/vt/scripts/compare_lb_data_file.py", line 4, in <module>
##[error]    from deepdiff import DeepDiff
##[error]  File "/usr/local/lib/python3.8/dist-packages/deepdiff/__init__.py", line 10, in <module>
##[error]    from .diff import DeepDiff
##[error]  File "/usr/local/lib/python3.8/dist-packages/deepdiff/diff.py", line 30, in <module>
##[error]    from deepdiff.distance import DistanceMixin, logarithmic_similarity
##[error]  File "/usr/local/lib/python3.8/dist-packages/deepdiff/distance.py", line 1, in <module>
##[error]    import numpy as np
##[error]ModuleNotFoundError: No module named 'numpy'
##[error]The process '/usr/bin/docker' failed with exit code 2

A similar error exists on develop. To reproduce, run from the vt directory:

docker-compose run -e BUILD_TYPE=debug -e VT_CI_TEST_LB_SCHEMA=1 ubuntu-cpp

This results in:

Traceback (most recent call last):
  File "/vt/scripts/compare_lb_data_file.py", line 4, in <module>
    from deepdiff import DeepDiff
ModuleNotFoundError: No module named 'deepdiff'

Relevant lines in the ubuntu-gnu-cpp.dockerfile:

https://github.com/DARMA-tasking/vt/blob/4530e434eafa2807604b55322a4c89ac3e5e1164/ci/docker/ubuntu-gnu-cpp.dockerfile#L95-L96
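
Both tracebacks above come down to a Python module missing from the image (numpy in CI, deepdiff locally). As a rough sketch only, and not something that exists in the vt repository, a preflight check along these lines could report every missing module at once before compare_lb_data_file.py is invoked. The names `REQUIRED`, `find_missing`, and `preflight_message` are hypothetical, and the module list is taken from the two tracebacks only:

```python
import importlib.util

# Assumption: these are the only two modules of interest, taken from the
# tracebacks in this comment; the real script may need others.
REQUIRED = ("deepdiff", "numpy")


def find_missing(modules=REQUIRED):
    """Return the top-level module names that cannot be located for import."""
    # find_spec returns None for a top-level module that is not installed,
    # without actually importing it.
    return [name for name in modules if importlib.util.find_spec(name) is None]


def preflight_message(missing):
    """Build a one-line summary suitable for a CI log."""
    if not missing:
        return "all required modules are importable"
    return "missing Python modules: " + ", ".join(missing)
```

Calling `print(preflight_message(find_missing()))` inside the container would then name the missing packages directly, instead of failing on the first import that happens to be reached.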


I'll mark this issue as In progress until this is resolved.

This was fixed by #2341

thearusable commented 2 months ago

@cwschilly I reran the failing gcc-9 job after pushing a new Docker image to Docker Hub. It seems to work now.

cwschilly commented 2 months ago

@thearusable Awesome, thanks! I'll rebase and then this should be ready for SNL reviews

cwschilly commented 2 months ago

@lifflander @nlslatt

I think we should wait to merge this PR until there is a stable vt-tv release so that we can target that release when we build vt-tv within vt (we currently just build master).

This would protect against future pushes to vt-tv's master branch that may break the pipeline. Then, before every subsequent release of both vt and vt-tv, we would just need to verify that vt-tv is still compatible with vt.
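
To illustrate the distinction between building a moving branch and targeting a fixed release, here is a minimal sketch of a check that a configured vt-tv ref is pinned. Everything in it is an assumption for illustration: the helper name `is_pinned` is hypothetical, and the tag pattern (an optional `v` followed by three dot-separated numbers) is a guess at a release naming scheme, not vt-tv's actual convention:

```python
import re

# Assumption: a pinned ref is either a full 40-character commit SHA or a
# version-like tag such as "1.2.3" or "v1.2.3". Branch names like "master"
# can move, so they are treated as unpinned.
_FULL_SHA = re.compile(r"[0-9a-f]{40}")
_VERSION_TAG = re.compile(r"v?\d+\.\d+\.\d+")


def is_pinned(ref):
    """Return True if `ref` looks like a fixed commit or release tag."""
    return bool(_FULL_SHA.fullmatch(ref) or _VERSION_TAG.fullmatch(ref))
```

With these assumptions, `is_pinned("master")` is false and `is_pinned("v1.0.0")` is true, so a CI step could refuse to build vt-tv from a ref that may change underneath it.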