milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0
29.78k stars 2.86k forks source link

[Enhancement]: IBM Power (ppc64le) support #29566

Open sumitd2 opened 9 months ago

sumitd2 commented 9 months ago

Is there an existing issue for this?

What would you like to be added?

I belong to the IBM Power porting team. We have recently developed patches to build Milvus on ppc64le, and would like to open a PR to upstream the changes to this repo. The major changes were adjusting the versions for some conan installed dependencies, and creating a local cmake repo for conan because the conan binary package for cmake currently does not have power binaries. So we need to somehow find a way to ensure cmake supports power, and any future dependency version modifications work on all architectures. Do you have any suggestions?

Why is this needed?

There are customers who wish to use Milvus on Power.

Anything else?

https://github.com/ppc64le/build-scripts/pull/3467

alexanderguzhva commented 3 months ago

I don't access to test machine, but I think that https://github.com/zilliztech/knowhere/pull/665 should fix the problem

sumitd2 commented 3 months ago

@alexanderguzhva FYI Now its building fine

sumitd2 commented 2 months ago

@locustbaby @xiaofan-luan I ran the build and tests on the latest master today and got the following failures. Can you please help?

FAIL    github.com/milvus-io/milvus/internal/datanode
FAIL    github.com/milvus-io/milvus/internal/storage
FAIL    github.com/milvus-io/milvus/pkg/mq/msgdispatcher
FAIL    github.com/milvus-io/milvus/pkg/mq/msgstream
FAIL    github.com/milvus-io/milvus/pkg/mq/msgstream/mqwrapper/pulsar
FAIL    github.com/milvus-io/milvus/pkg/streaming/walimpls/impls/pulsar
FAIL    github.com/milvus-io/milvus/cmd/tools/config
xiaofan-luan commented 2 months ago

@locustbaby @xiaofan-luan I ran the build and tests on the latest master today and got the following failures. Can you please help?

FAIL    github.com/milvus-io/milvus/internal/datanode
FAIL    github.com/milvus-io/milvus/internal/storage
FAIL    github.com/milvus-io/milvus/pkg/mq/msgdispatcher
FAIL    github.com/milvus-io/milvus/pkg/mq/msgstream
FAIL    github.com/milvus-io/milvus/pkg/mq/msgstream/mqwrapper/pulsar
FAIL    github.com/milvus-io/milvus/pkg/streaming/walimpls/impls/pulsar
FAIL    github.com/milvus-io/milvus/cmd/tools/config

Is there a way we can see the ci/cd and logs?

From the case it seems that Power CPU didn't support to deploy pulsar.

sumitd2 commented 2 months ago

@xiaofan-luan Here is the output of make test-go :

milvus-v2.4.1-test-go.log docker-compose.yam.txt

@xiaofan-luan There is no ci, I run it manually on a power vm. You can refer to the last log file as above. Will have to rerun the build for the latest one.

sumitd2 commented 2 months ago

@xiaofan-luan

I got these four passing by using a third part pulsar image for Power:

ok    github.com/milvus-io/milvus/pkg/mq/msgdispatcher
ok    github.com/milvus-io/milvus/pkg/mq/msgstream
ok    github.com/milvus-io/milvus/pkg/mq/msgstream/mqwrapper/pulsar
ok    github.com/milvus-io/milvus/pkg/streaming/walimpls/impls/pulsar

So now there are 3 left:

  1. > FAIL github.com/milvus-io/milvus/internal/datanode This one is printing:
    
    --- FAIL: TestDataNode (0.01s)
    panic: fail to init rocksmq [recovered]
        panic: fail to init rocksmq

2. `> FAIL    github.com/milvus-io/milvus/internal/storage`
This one looks Azure related. Is there any additional setup required to run this? Can you confirm?

3. `> FAIL    github.com/milvus-io/milvus/cmd/tools/config`
I think you will have to fix this. Can you confirm? Its printing:

--- FAIL: TestYamlFile (0.26s) generate_test.go:44: Error Trace: /sumit/milvus/cmd/tools/config/generate_test.go:44 Error: configs/milvus.yaml is not consistent with paramtable, file: [autoindex:], code: [# Licensed to the LF AI & Data foundation under one]. Do not edit milvus.yaml directly. Test: TestYamlFile



Thanks.
xiaofan-luan commented 2 months ago
  1. we need log for this run to understand why rocksdb can not work 2.ignore azure for now is ok, do we have log for that?
  2. do we have pr already? I need to understand what part of this yaml has been changed.

Please generate a pr and I will start review on this

sumitd2 commented 2 months ago

https://github.com/milvus-io/milvus/pull/34978

sumitd2 commented 2 months ago

@alexanderguzhva This line is causing a missing header error during the power build: https://github.com/zilliztech/knowhere/blob/38e4c88ce2ffda1d6ca0bfed0afe3a34743ce0c5/include/knowhere/comp/thread_pool.h#L16 If I install openblas-devel and change it to #include <openblas/cblas.h> it builds fine.

alexanderguzhva commented 2 months ago

@sumitd2 will issue a fix shortly

alexanderguzhva commented 2 months ago

@sumitd2 if the following #ifdef sufficient? Or would you like me to use a particular one,?

#if defined(__PPC64__) || defined(__ppc64__) || defined(__PPC64LE__) || defined(__ppc64le__)

Thanks.

sumitd2 commented 2 months ago

also add __powerpc64__

alexanderguzhva commented 2 months ago

@sumitd2 https://github.com/zilliztech/knowhere/pull/754 Please let me know if it works. I will ask guys to stamp if it does.

sumitd2 commented 2 months ago

@alexanderguzhva it worked, thank you!

There is another error during the build (make). Not sure if you can fix it but posting it:

Building Milvus ...
# github.com/bytedance/sonic/internal/rt
/root/go/pkg/mod/github.com/bytedance/sonic@v1.9.1/internal/rt/gcwb.go:102:6: missing function body
/root/go/pkg/mod/github.com/bytedance/sonic@v1.9.1/internal/rt/gcwb.go:104:6: missing function body
/root/go/pkg/mod/github.com/bytedance/sonic@v1.9.1/internal/rt/gcwb.go:114:6: missing function body
make: *** [Makefile:82: milvus] Error 1

@xiaofan-luan

alexanderguzhva commented 2 months ago

@sumitd2 yep, I believe that this is out of my immediate knowledge, sorry about that

stale[bot] commented 4 weeks ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

xiaofan-luan commented 3 weeks ago

@alexanderguzhva it worked, thank you!

There is another error during the build (make). Not sure if you can fix it but posting it:

Building Milvus ...
# github.com/bytedance/sonic/internal/rt
/root/go/pkg/mod/github.com/bytedance/sonic@v1.9.1/internal/rt/gcwb.go:102:6: missing function body
/root/go/pkg/mod/github.com/bytedance/sonic@v1.9.1/internal/rt/gcwb.go:104:6: missing function body
/root/go/pkg/mod/github.com/bytedance/sonic@v1.9.1/internal/rt/gcwb.go:114:6: missing function body
make: *** [Makefile:82: milvus] Error 1

@xiaofan-luan

@haorenfsa see that there is a problem for sonic on ppc arch. guess it dosn't support ppc as well. can we compile it as an optional?

sumitd2 commented 3 weeks ago

@alexanderguzhva it worked, thank you! There is another error during the build (make). Not sure if you can fix it but posting it:

Building Milvus ...
# github.com/bytedance/sonic/internal/rt
/root/go/pkg/mod/github.com/bytedance/sonic@v1.9.1/internal/rt/gcwb.go:102:6: missing function body
/root/go/pkg/mod/github.com/bytedance/sonic@v1.9.1/internal/rt/gcwb.go:104:6: missing function body
/root/go/pkg/mod/github.com/bytedance/sonic@v1.9.1/internal/rt/gcwb.go:114:6: missing function body
make: *** [Makefile:82: milvus] Error 1

@xiaofan-luan

@haorenfsa see that there is a problem for sonic on ppc arch. guess it dosn't support ppc as well. can we compile it as an optional?

@xiaofan-luan I am currently working on building master, and have already fixed this error by creating a sonic_ppc64le.go which uses encoding/json.

haorenfsa commented 3 weeks ago

/assign

haorenfsa commented 3 weeks ago

@sumitd2 to save the work, I think we could just disable sonic json for ppc64le.

haorenfsa commented 3 weeks ago

@sumitd2 I just checked the code, and it's already optional. So we can just use make MILVUS_GO_BUILD_TAGS=dynamic to disable sonic.

image
sumitd2 commented 3 weeks ago

@sumitd2 I just checked the code, and it's already optional. So we can just use make MILVUS_GO_BUILD_TAGS=dynamic to disable sonic.

@haorenfsa That will disable it for every architecture, so why would you want to do that? I have already done the work, just needs a PR.

haorenfsa commented 3 weeks ago

@sumitd2 I just checked the code, and it's already optional. So we can just use make MILVUS_GO_BUILD_TAGS=dynamic to disable sonic.

@haorenfsa That will disable it for every architecture, so why would you want to do that? I have already done the work, just needs a PR.

Um, I think you may have misunderstood. What I meant is that you can turn it off when compiling on powerPC. We still keep it on by default for other architectures.

haorenfsa commented 3 weeks ago

And of course it's great that you implement it for ppc. Your patch is very welcomed.

But for other not covered architectures, like mips or riscv maybe, if one also want to build milvus on that, one can use the command I mentioned above. There won't be any difference as long as one use GRPC sdk not RESTful API.

sumitd2 commented 3 weeks ago

@xiaofan-luan I have the latest master building on Power, and the C++ tests are passing (except for the Azure ones). This feels like the right time to set up your CI. Unfortunately GHA runner cannot be installed on the OSU VM we have provided you because it does not support Power. So the only option I see is Buildkite, which is free for open source projects. It integrates well with GHA so you will be able to see the Power job in the list on the PR page. Let me know if you are fine with it.

@locustbaby I need access to the OSU VM that has been assigned to you, to be able to setup the CI agent for the Power builds.

cc: @seth-priya

xiaofan-luan commented 3 weeks ago

@chuanfeng

could you helo to invesigate on that? Let's merge this pr asap