mojaloop / project

Repo to track product development issues for the Mojaloop project.

Mojaloop Helm deployments are not compatible when deployed to ARM-arch based hosts #2317

Open mdebarros opened 3 years ago

mdebarros commented 3 years ago

Summary:

Testing Mojaloop Helm deployment on a 2021 Mac with the new M1 (ARM arch) CPU results in the deployment failing due to compatibility issues.

Specific issues identified are with:

Severity: Low

Priority: Medium

Expected Behavior: Mojaloop Helm deployments should start up on ARM-based hosts just as they do on AMD64.

Steps to Reproduce

  1. Buy M1 Mac
  2. Install Pre-requisites (Docker, etc)
  3. Deploy Mojaloop Helm (v12.x or v13.x)

Specifications

Notes:

tdaly61 commented 2 years ago

Hi @elnyry-sam-k, @mdebarros: I am taking a slight side road for the next week or two to work on the mini-loop deployment, hopefully making it work with a) ML 13.1.0 and b) arm64. I have assigned this issue to myself, as I will likely need to get the Kafka chart working on ARM to do this. FWIW: I have the Kafka Docker container working on ARM. The biggest issue is where to host a purpose-built arm64 chart, and how much work to put into this given that the Bitnami folks will eventually get there (though I am not at all sure that is not still years away).

tdaly61 commented 2 years ago

Also FWIW: I am making progress on this. I have the Kafka and ZooKeeper charts running and connecting OK.

elnyry-sam-k commented 2 years ago

Thanks for the update, Tom. I've moved it to in progress based on your comment.

Regarding the question, maybe we can discuss it on one of the calls (I suppose we can get some ARM-based EC2 instances on AWS if required).

tdaly61 commented 2 years ago

OK, it looks like npm / Node.js is built on Chrome's JavaScript engine (V8), which is written in C++ and is therefore architecture-specific AND baked into the Mojaloop images. Unlike the external charts for ZooKeeper, MySQL and Kafka, this is not a run-time issue but a "build-time" issue, and therefore harder. I am looking at options for solving this and for support, but needless to say this just got, as I say, harder.

millerabel commented 2 years ago

This is true for all of the official Docker images that we depend on to build Mojaloop. Alpine Linux, NodeJS, redis, MySQL, nginx, etc…

All of these images are available for both X86-64 and ARM64 hardware architectures.

So building Mojaloop Docker images for both X86-64 and ARMV8 should be possible. We have no need to support 32-bit environments or other architectures so the dimensionality of the image set remains low.
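As a rough illustration of keeping the image set to just these two architectures, the sketch below maps the CPU name a host reports to a Docker platform string. The helper name `docker_platform` and the alias table are assumptions made for this example and are not part of any Mojaloop tooling; the resulting strings are the kind of value passed to `docker buildx build --platform`.

```python
# Hypothetical helper (not from the Mojaloop repos): normalise the CPU name a
# host reports into the platform string Docker uses for image selection.
_ARCH_ALIASES = {
    "x86_64": "amd64",   # typical on Linux and Intel Macs
    "amd64": "amd64",    # typical on Windows (reported as "AMD64")
    "aarch64": "arm64",  # typical on Linux ARM hosts
    "arm64": "arm64",    # typical on Apple M1 Macs
}


def docker_platform(machine: str) -> str:
    """Return "linux/amd64" or "linux/arm64" for a reported machine name."""
    arch = _ARCH_ALIASES.get(machine.strip().lower())
    if arch is None:
        # 32-bit and other architectures are deliberately out of scope.
        raise ValueError(f"unsupported architecture: {machine!r}")
    return f"linux/{arch}"
```

On a given host, `docker_platform(platform.machine())` (with `import platform` from the standard library) would yield the platform string for the local CPU.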

We can then assemble charts for target use environments like developer X86-64 laptop, developer Mac M1 (ARM64), and runtime X86-64 server. Each environment can have a tuned config appropriate to use.

On the server, we have previously explored “small, medium, and large” charts that describe different runtime configs to support varying operating loads (and with different cost controls for hosted cloud).

tdaly61 commented 2 years ago

Hi @millerabel, well the good news is that MySQL, Kafka and Nginx (or your ingress controller of choice) are all configurable at deploy-time, and I have all of these running on an arm64 VM in the cloud. It is more Alpine Linux and NodeJS that I am working on right now, i.e. the bits that are baked into "our container images" at build-time.

Your comments are very encouraging, as I think I am realising that this is an increasingly important AI, not just because of the Apple Mac dev issue but because of AWS Graviton and other cloud vendors (likely including Google) also heading down the ARM path. I am currently doing dev/test on the Oracle "always free tier", where they give you 4 Arm64 CPUs, 24GB RAM and everything else you need to do decent dev/test for free, with no expiration date.

Now that said, is there anything written that says how the small/medium/large images were going to be hosted, differentiated and tested that we could apply here for the different architectures? TIA

millerabel commented 2 years ago

Very glad the platforms are factored out and can be assembled at deploy time.

We have not previously considered the hardware architecture dimension, so this is new ground. I might look to other Docker standard images to see how naming and versioning is done and adopt something similar.

Server Sizing

As for the Small/Med/Large for runtime, we would likely defer this to actual implementers. We did this as an early demonstration when we first brought up Rancher/etc to demonstrate how a runtime admin would manage changes in load in production. Miguel / Sam can likely recall that work.

I’d like to hear from others on this. I think we might need to anticipate different deployment scales, but leave it to a second phase of configuration work, as templates for use by implementers. The demonstrated idea was to use Rancher to “heal” the system from one structured configuration into another without downtime.

I don’t think we know enough yet about production size variability to specify this in the release code. And there might be distinct choices in Azure or AWS or Google clouds that suggest cost/perf dimensions we would capture.

Consider server configs for dev/sandbox/prod and single-dev laptop and of course our own AWS ci/cd cloud config that we will use directly to test and build.

What say you all?

— Miller

On Apr 4, 2022, at 3:31 AM, Tom Daly wrote:

Now that said, is there anything written that says how the small/medium/large images were going to be hosted and differentiated so that they could be configured in the charts at run-time? Alternatively, can I assume that we would create image names (on Docker Hub) such as arm64/mojaloop/central-ledger? Is that the sort of approach being discussed? If so, what about testing different architectures: has anything been discussed or decided there already?
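To make the naming question concrete, here is a minimal sketch of how per-architecture image names in the arm64/mojaloop/central-ledger shape could be generated. The function `image_ref`, the `SUPPORTED_ARCHS` set and the tag value in the usage note are all hypothetical; nothing in this thread settles the actual convention, or whether a given registry accepts a three-part path.

```python
# Hypothetical sketch of the arch-prefixed naming floated in this thread.
# Only the "arm64/mojaloop/central-ledger" shape comes from the discussion;
# the helper, the tag handling and the supported set are assumptions.
SUPPORTED_ARCHS = {"amd64", "arm64"}


def image_ref(arch: str, repo: str, tag: str, org: str = "mojaloop") -> str:
    """Build an image reference of the form <arch>/<org>/<repo>:<tag>."""
    if arch not in SUPPORTED_ARCHS:
        raise ValueError(f"unsupported architecture: {arch!r}")
    if not repo or not tag:
        raise ValueError("repo and tag must be non-empty")
    return f"{arch}/{org}/{repo}:{tag}"
```

For example, `image_ref("arm64", "central-ledger", "example-tag")` gives `arm64/mojaloop/central-ledger:example-tag`, which a chart could then select per target environment.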

tdaly61 commented 2 years ago

@millerabel: one clarification from me. I was not trying to pursue the idea of small/medium/large beyond what it could help us with in supporting both x86 and arm64 architectures. Cheers.

tdaly61 commented 2 years ago

Post conference update : I am trying to build the images listed below and as you can see all are building except:

The auth-service build fails because of an issue with the sqlite3 build; I have not yet diagnosed the others.

found image [als_oracle_pathfinder_local] so skipping build for now
found image [finance_portal_backend_service_local] so skipping build for now
found image [transaction_requests_service_local] so skipping build for now
no existing image for [auth-service] ; building ...
Error building docker image for [auth_service_local]
found image [email_notifier_local] so skipping build for now
found image [thirdparty_api_svc_local] so skipping build for now
found image [ml_test_toolkit_local] so skipping build for now
no existing image for [als-consent-oracle] ; building ...
Error building docker image for [als_consent_oracle_local]
found image [quoting_service_local] so skipping build for now
found image [central_settlement_local] so skipping build for now
found image [account_lookup_service_local] so skipping build for now
found image [simulator_local] so skipping build for now
found image [settlement_management_local] so skipping build for now
found image [ml_api_adapter_local] so skipping build for now
no existing image for [central-kms] ; building ...
Error building docker image for [central_kms_local]
found image [buld_api_adapter_local] so skipping build for now
found image [operator_settlement_local] so skipping build for now
found image [central_ledger_local] so skipping build for now
found image [finance_portal_ui_local] so skipping build for now
no existing image for [ml-testing-toolkit-ui] ; building ...
Error building docker image for [ml_testing_tookit_ui_local]
found image [central_event_processor_local] so skipping build for now
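A log like the one above is easier to track across build attempts if it is reduced to a list of existing and failed images. The sketch below is one way to do that; it assumes only the two message shapes visible in this output ("found image [...] so skipping build" and "Error building docker image for [...]"), and the function name `summarize_build_log` is made up for this example rather than taken from the build script.

```python
import re

# Patterns match the two message shapes seen in the log above; any other
# output from the (unseen) build script is ignored.
_FOUND = re.compile(r"found image \[([^\]]+)\] so skipping build")
_ERROR = re.compile(r"Error building docker image for \[([^\]]+)\]")


def summarize_build_log(text: str) -> dict:
    """Return image names grouped into already-built and failed builds."""
    return {
        "existing": _FOUND.findall(text),
        "failed": _ERROR.findall(text),
    }
```

Because the patterns are searched across the whole text, this works whether the log arrives one message per line or run together on a single line.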