apache / druid

Apache Druid: a high performance real-time analytics database.
https://druid.apache.org/
Apache License 2.0

arm64 Docker images openly available #11820

Open igorvpcleao opened 3 years ago

igorvpcleao commented 3 years ago

Description

apache/druid repository on Dockerhub stores amd64 images only. I'd like to suggest generating official arm64 images.

Motivation

arm64 workloads are becoming more and more popular. If possible, Docker images should also be available for this architecture.

anuragagarwal561994 commented 2 years ago

@igorvpcleao

https://github.com/arm64-compat/apache-druid

I have implemented a workaround for this. If you are using a Mac with an M1 chip, these images should greatly speed up your workflow (in my experience). Note that the images are for development purposes only.

Not to be used in production.

I haven't thoroughly tested them, so I would love to hear your comments.

2bethere commented 2 years ago

Woho~.

Did you have to do anything special to get the arm build up and running?

anuragagarwal561994 commented 2 years ago

Nothing special as such. To summarise the changes, I think only two things are mainly required.

In my repo I made a few more changes: I moved the mvn build step from the image to CI so that it can reuse the Maven build cache next time, along with some related changes.

Although the image I pushed is not currently working (I tested after I posted the comment), I repeated the same steps locally and it worked fine. I should be able to figure out the issue and correct it over the next 2 days.

The UI is not loading and returns a 404 instead. My initial guess is that the console didn't compile properly in the CI environment and nothing was reported while building; I will drill down.
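The cross-build itself can be sketched with docker buildx; the builder and image names below are placeholders (not the actual arm64-compat setup), and cross-building arm64 on an amd64 CI host additionally needs QEMU/binfmt emulation:

```shell
# Placeholder names throughout; adjust for your registry and repo.
docker buildx create --name druid-builder --use
docker buildx build \
  --platform linux/arm64,linux/amd64 \
  -t example.io/druid:dev \
  --push .
```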

2bethere commented 2 years ago

OK, thanks for trying this out. I think it'll be awesome to get an arm build running well on Travis.

anuragagarwal561994 commented 2 years ago

@2bethere https://github.com/arm64-compat/apache-druid/issues/1 I identified the issue and fixed it.

If you want, you can give this image a try as well.

I can also contribute the Dockerfile fixes to the apache/druid repo. I saw that the .travis build check already tests for ARM.

We would just have to decide on a base image.

2bethere commented 2 years ago

Yeah, on travis the console is skipped to improve build speed right now. We should probably try to push https://github.com/apache/druid/pull/11109 forward to get an arm build on docker hub.

Double-checking: is this what you are asking for?

anuragagarwal561994 commented 2 years ago

Yes. I may have mixed up the commands while building on Travis; it is fixed in my image now. I was asking whether you would like to test my image and see if everything works correctly, or whether there is any way I can contribute an arm-compatible Docker image to the project. So far I haven't seen any issues.

FrankChen021 commented 2 years ago

An ARM64 image can now be built from the master branch on both Linux and Mac M1/M2. But I don't know when an official ARM64 image will be provided on Docker Hub.
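For reference, building from source can be sketched as follows; the Dockerfile path follows Druid's Docker docs (verify against your checkout), and the tag is arbitrary:

```shell
# Sketch, assuming a checkout of apache/druid (master branch).
# Docker builds for the host architecture by default, so on
# Apple Silicon this produces an arm64 image.
docker build -t apache/druid:master-arm64 -f distribution/docker/Dockerfile .
```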

m17kea commented 1 year ago

Any update on this?

dudo commented 9 months ago

Let's add 2024 to the mix.

Any update on this? I'm pulling apache/druid:28.0.1 and still not seeing ARM support in docker hub.

I see the docs show how to manually build, but is it that much of a hassle to host them? (legitimately curious)

anuragagarwal561994 commented 9 months ago

You can pull an unofficial arm64-compatible image from https://github.com/arm64-compat/apache-druid

Let me know if you want me to upgrade its version, or if I can help contribute.

dudo commented 9 months ago

Thanks, @anuragagarwal561994! It looks like there is "official support" now; you just have to build it yourself.

Unfortunately I wasn't able to compile main on my M1 last night. This is the error I'm getting, in case anyone more familiar with Java can help.


[ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:3.1.0:exec (generate-binary-license) on project distribution: Command execution failed.: Process exited with an error: 1 (Exit value: 1) -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.codehaus.mojo:exec-maven-plugin:3.1.0:exec (generate-binary-license) on project distribution: Command execution failed.

anuragagarwal561994 commented 9 months ago

@dudo Actually, since it is more or less Java, it is compilable across platforms. I will try to build it myself; if there is more to look into, I will invest some time this week. Can you also tell me which version or tag you are trying to build?

anuragagarwal561994 commented 9 months ago

@dudo I have built the latest version 28.0.1

https://github.com/arm64-compat/apache-druid/pkgs/container/apache%2Fdruid/177604533?tag=28.0.1

You can try using it for your local / staging setup. Please refrain from using it in production: I am just a maintainer of this repo / organization, and I can't actively test these images and their features myself. I hope this makes your life easier :)

dmitry-livchak-qco commented 9 months ago

I second this. Would be very handy if we could run an "out-of-box" Druid image on AWS Graviton. Makes a lot of sense both for small and large scale setups, especially if sharing a K8s cluster with other ARM workloads.

fabricebaranski commented 5 months ago

I have built the latest version 30.0.0 locally, but when I use it, the middleManager container crashes during ingestion.

2bethere commented 5 months ago

Any specific issue you are seeing? Maybe some error logs will help.

fabricebaranski commented 5 months ago

No error visible, as my container is no longer there when it crashes. Just exit code 137.
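For what it's worth, exit code 137 means the process died from signal 9 (SIGKILL), since Docker reports 128 + the signal number; for a container this most often points to the kernel OOM killer. The arithmetic, as a quick shell check:

```shell
# Decode a Docker exit code above 128: subtract 128 to get the signal.
code=137
sig=$((code - 128))
echo "signal $sig is SIG$(kill -l "$sig")"   # → signal 9 is SIGKILL
```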

2bethere commented 5 months ago

OK, can you do a docker logs <container_id> and provide some details there? It's hard to know what's crashing.
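If the kernel OOM-killed the container, the logs often just stop; Docker's recorded state can confirm it. A sketch with the standard CLI (the container name here is a placeholder):

```shell
# "druid-middlemanager" is a placeholder; substitute your container name.
docker logs --tail 100 druid-middlemanager
# ExitCode and OOMKilled are read from the container's recorded state.
docker inspect --format \
  '{{.State.ExitCode}} OOMKilled={{.State.OOMKilled}}' druid-middlemanager
```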

fabricebaranski commented 5 months ago

Here are the last lines of the logs

2024-06-20 17:30:27 2024-06-20T15:30:27,303 DEBUG [qtp141697265-160] org.apache.druid.jetty.RequestLog - 192.168.192.11 GET //192.168.192.11:8091/druid/worker/v1/chat/query-e6f6ac5a-b2c1-41d8-bf4c-9798ef360108-worker0_0/counters HTTP/1.1 200
2024-06-20 17:30:27 2024-06-20T15:30:27,305 INFO [qtp141697265-148] org.apache.druid.msq.exec.WorkerImpl - Finish received for task [query-e6f6ac5a-b2c1-41d8-bf4c-9798ef360108-worker0_0]
2024-06-20 17:30:27 2024-06-20T15:30:27,305 DEBUG [qtp141697265-148] org.apache.druid.jetty.RequestLog - 192.168.192.11 POST //192.168.192.11:8091/druid/worker/v1/chat/query-e6f6ac5a-b2c1-41d8-bf4c-9798ef360108-worker0_0/finish HTTP/1.1 202
2024-06-20 17:30:27 2024-06-20T15:30:27,306 INFO [[query-e6f6ac5a-b2c1-41d8-bf4c-9798ef360108-worker0_0]-threading-task-runner-executor-1] org.apache.druid.msq.exec.LoadedSegmentDataProviderFactory - Waiting for any data server queries to be canceled.
2024-06-20 17:30:27 2024-06-20T15:30:27,306 WARN [controller-status-checker-0] org.apache.druid.msq.indexing.IndexerWorkerContext - Periodic fetch of controller location returned [ServiceLocations{locations=[], closed=true}]. Worker task [query-e6f6ac5a-b2c1-41d8-bf4c-9798ef360108-worker0_0] will exit.
2024-06-20 17:30:27 2024-06-20T15:30:27,306 INFO [controller-status-checker-0] org.apache.druid.msq.exec.WorkerImpl - Stopping gracefully for taskId [query-e6f6ac5a-b2c1-41d8-bf4c-9798ef360108-worker0_0]
2024-06-20 17:30:27 2024-06-20T15:30:27,308 INFO [threading-task-runner-executor-1] org.apache.druid.indexing.overlord.ThreadingTaskRunner - Removed task directory: var/druid/task/slot3/query-e6f6ac5a-b2c1-41d8-bf4c-9798ef360108-worker0_0
2024-06-20 17:30:27 2024-06-20T15:30:27,347 INFO [WorkerTaskManager-NoticeHandler] org.apache.druid.indexing.worker.WorkerTaskManager - Task [query-e6f6ac5a-b2c1-41d8-bf4c-9798ef360108-worker0_0] completed with status [SUCCESS].
2024-06-20 17:30:27 2024-06-20T15:30:27,471 DEBUG [qtp141697265-136] org.apache.druid.jetty.RequestLog - 127.0.0.1 GET //localhost:8091/status/health HTTP/1.1 200
2024-06-20 17:30:27 2024-06-20T15:30:27,514 INFO [[19e455bc-a674-4f9b-a550-436ff843709f_0_0:0]-batch-appenderator-persist] org.apache.druid.segment.realtime.appenderator.BatchAppenderator - Persisted in-memory data for segment[19b98542-5e68-4bfd-982c-8c45356fd76b_vertex_-146136543-09-08T08:23:32.096Z_146140482-04-24T15:36:27.903Z_2024-06-20T15:30:26.849Z] spill[0] to disk in [401] ms (23,321 rows).
2024-06-20 17:30:27 2024-06-20T15:30:27,522 INFO [[19e455bc-a674-4f9b-a550-436ff843709f_0_0:0]-batch-appenderator-persist] org.apache.druid.segment.realtime.appenderator.BatchAppenderator - Persisted in-memory data for segments: 19b98542-5e68-4bfd-982c-8c45356fd76b_vertex_-146136543-09-08T08:23:32.096Z_146140482-04-24T15:36:27.903Z_2024-06-20T15:30:26.849Z
2024-06-20 17:30:27 2024-06-20T15:30:27,522 INFO [[19e455bc-a674-4f9b-a550-436ff843709f_0_0:0]-batch-appenderator-persist] org.apache.druid.segment.realtime.appenderator.BatchAppenderator - Persisted stats: processed rows: [46494], persisted rows[23321], persisted sinks: [1], persisted fireHydrants (across sinks): [1]
2024-06-20 17:30:27 2024-06-20T15:30:27,522 INFO [[19e455bc-a674-4f9b-a550-436ff843709f_0_0:0]-batch-appenderator-persist] org.apache.druid.segment.realtime.appenderator.BatchAppenderator - Persisted rows[23,321] and bytes[30,990,856] and removed all sinks & hydrants from memory in[408] millis
2024-06-20 17:30:27 2024-06-20T15:30:27,522 INFO [[19e455bc-a674-4f9b-a550-436ff843709f_0_0:0]-batch-appenderator-persist] org.apache.druid.segment.realtime.appenderator.BatchAppenderator - Persist is done.
2024-06-20 17:30:27 2024-06-20T15:30:27,522 INFO [[19e455bc-a674-4f9b-a550-436ff843709f_0_0:0]-batch-appenderator-persist] org.apache.druid.segment.realtime.appenderator.BatchAppenderator - Spawning intermediate persist
2024-06-20 17:30:27 2024-06-20T15:30:27,565 INFO [[query-e6f6ac5a-b2c1-41d8-bf4c-9798ef360108]-threading-task-runner-executor-0] org.apache.druid.msq.exec.ControllerImpl - Controller will now wait for segments to be loaded. The query has already finished executing, and results will be included once the segments are loaded, even if this query is cancelled now.
2024-06-20 17:30:27 2024-06-20T15:30:27,567 INFO [query-e6f6ac5a-b2c1-41d8-bf4c-9798ef360108-segment-load-waiter-0] org.apache.druid.msq.exec.SegmentLoadStatusFetcher - Fetching segment load status for datasource[19b98542-5e68-4bfd-982c-8c45356fd76b] from broker
2024-06-20 17:30:27 2024-06-20T15:30:27,737 INFO [processing-0] org.apache.druid.segment.realtime.appenderator.BatchAppenderator - Incremental persist to disk because bytesCurrentlyInMemory[30995920] is greater than maxBytesInMemory[30994978].
2024-06-20 17:30:27 2024-06-20T15:30:27,897 DEBUG [qtp141697265-148] org.apache.druid.jetty.RequestLog - 192.168.192.8 GET //192.168.192.11:8091/druid/listen/v1/lookups HTTP/1.1 200
2024-06-20 17:30:27 2024-06-20T15:30:27,992 INFO [[19e455bc-a674-4f9b-a550-436ff843709f_0_0:0]-batch-appenderator-persist] org.apache.druid.segment.realtime.appenderator.BatchAppenderator - Persisted in-memory data for segment[19b98542-5e68-4bfd-982c-8c45356fd76b_vertex_-146136543-09-08T08:23:32.096Z_146140482-04-24T15:36:27.903Z_2024-06-20T15:30:26.849Z] spill[1] to disk in [468] ms (23,173 rows).
2024-06-20 17:30:28 2024-06-20T15:30:28,001 INFO [[19e455bc-a674-4f9b-a550-436ff843709f_0_0:0]-batch-appenderator-persist] org.apache.druid.segment.realtime.appenderator.BatchAppenderator - Persisted in-memory data for segments: 19b98542-5e68-4bfd-982c-8c45356fd76b_vertex_-146136543-09-08T08:23:32.096Z_146140482-04-24T15:36:27.903Z_2024-06-20T15:30:26.849Z
2024-06-20 17:30:28 2024-06-20T15:30:28,002 INFO [[19e455bc-a674-4f9b-a550-436ff843709f_0_0:0]-batch-appenderator-persist] org.apache.druid.segment.realtime.appenderator.BatchAppenderator - Persisted stats: processed rows: [68525], persisted rows[23173], persisted sinks: [1], persisted fireHydrants (across sinks): [1]
2024-06-20 17:30:28 2024-06-20T15:30:28,002 INFO [[19e455bc-a674-4f9b-a550-436ff843709f_0_0:0]-batch-appenderator-persist] org.apache.druid.segment.realtime.appenderator.BatchAppenderator - Persisted rows[23,173] and bytes[30,990,704] and removed all sinks & hydrants from memory in[478] millis
2024-06-20 17:30:28 2024-06-20T15:30:28,002 INFO [[19e455bc-a674-4f9b-a550-436ff843709f_0_0:0]-batch-appenderator-persist] org.apache.druid.segment.realtime.appenderator.BatchAppenderator - Persist is done.
2024-06-20 17:30:28 2024-06-20T15:30:28,002 INFO [[19e455bc-a674-4f9b-a550-436ff843709f_0_0:0]-batch-appenderator-persist] org.apache.druid.segment.realtime.appenderator.BatchAppenderator - Spawning intermediate persist
2024-06-20 17:30:28 2024-06-20T15:30:28,232 INFO [processing-4] org.apache.druid.segment.realtime.appenderator.BatchAppenderator - Incremental persist to disk because bytesCurrentlyInMemory[30995490] is greater than maxBytesInMemory[30994978].
2bethere commented 5 months ago

I don't see any errors in the logs. What's the underlying machine you are testing this on? I assume this is native batch ingestion (an index_parallel task)? I can try to reproduce it.

fabricebaranski commented 5 months ago

MSQ ingestion using a parquet file. I launch 3 MSQ ingestions at the same time.

fabricebaranski commented 5 months ago

My machine is an Apple M2 Max

2bethere commented 5 months ago

Dope, will try a repro with docker.

fabricebaranski commented 5 months ago

An alternative is to build for the amd64 platform using buildx: docker buildx build --platform linux/amd64 . For this I have to deactivate 'Use Rosetta for x86_64/amd64 emulation on Apple Silicon'. Afterwards, reactivate 'Use Rosetta' and add

    platform: linux/amd64
    cpuset: '0'

to all druid containers in your docker-compose. It works, but it is quite slow.

2bethere commented 5 months ago

I think I got an arm64 build partially working locally in docker, but running into some service discovery issues. Will continue to chip away at this.

I'm also trying to figure out how docker images are published to docker hub from this project. Will report back as I make progress.

m17kea commented 2 months ago

Hey @2bethere did you manage to make any progress here?