Open igorvpcleao opened 3 years ago
@igorvpcleao
https://github.com/arm64-compat/apache-druid
I have implemented a workaround for this. If you are using a Mac with an M1 chip, these images should greatly speed up your workflow (in my experience). Note that the images are for development purposes only.
Not to be used in production.
I haven't thoroughly tested them, so I would love to hear your comments.
Woho~.
Did you have to do anything special to get the arm build up and running?
Nothing special as such. To summarise the changes, I guess only 2 things are mainly required:
But in my repo I made a few more changes: I moved the mvn build step from the image to CI so that it can use the Maven build cache on subsequent builds.
I also made some related changes.
However, the image that I pushed is not currently working. I tested it after I posted the comment; when I repeated the same steps locally, it worked fine. I should be able to figure out the issue and correct it over the next 2 days.
The UI is not loading and returns a 404 instead. My initial guess is that the console didn't compile properly in the CI environment and nothing was reported during the build. I will drill down.
OK, thanks for trying this out. I think it'll be awesome to get an arm build to run well on travis.
@2bethere https://github.com/arm64-compat/apache-druid/issues/1 I identified the issue and fixed it.
If you want you can give this image a try as well.
I can also contribute the fixes in the docker file to the apache/druid repo. I saw that the .travis build in the checks already tests for ARM.
We might just have to decide on a base image.
Yeah, on travis the console is skipped to improve build speed right now. We should probably try to push https://github.com/apache/druid/pull/11109 forward to get an arm build on docker hub.
Double checking this is what you are asking for?
Yes, I may have mixed up the commands while building on travis; it is fixed in my image now. I was asking whether you would want to test my image and see if everything works correctly, or whether there is any way I can contribute to the project to serve an arm-compatible docker image. So far I haven't seen any issues.
ARM64 image now can be built on the master branch from both Linux and Mac M1/M2. But I don't know when the official ARM64 image will be provided on Docker Hub.
Any update on this?
Let's add 2024 to the mix.
Any update on this? I'm pulling apache/druid:28.0.1 and still not seeing ARM support in docker hub.
I see the docs show how to manually build, but is it that much of a hassle to host them? (legitimately curious)
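One way to check which architectures a tag actually provides is to look at its manifest. Below is a minimal sketch, assuming the docker CLI with `docker manifest inspect` is available; the helper name `lists_arch` is made up for illustration and only scans the manifest JSON passed on stdin.

```shell
# Hypothetical helper: succeeds if the manifest JSON read from stdin
# contains an "architecture" entry equal to the first argument.
lists_arch() {
  grep -Eq "\"architecture\"[[:space:]]*:[[:space:]]*\"$1\""
}

# Usage against a real tag (needs network access and the docker CLI):
#   docker manifest inspect apache/druid:28.0.1 | lists_arch arm64 \
#     && echo "arm64 listed" || echo "arm64 not listed"
```

If the tag is a single-architecture image rather than a manifest list, the plain output may not contain a platform entry at all, so "not listed" should be read as "not advertised in the manifest", not as a definitive answer.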
You can pull an unofficial arm64-compatible image from https://github.com/arm64-compat/apache-druid
Let me know if you want me to upgrade the version, or if I can help contribute.
Thanks, @anuragagarwal561994! It looks like there is "official support" now, you just have to build it yourself.
Unfortunately I wasn't able to compile main on my M1 last night. This is the error I'm getting, if anyone is more familiar with Java.
-------------------------
[ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:3.1.0:exec (generate-binary-license) on project distribution: Command execution failed.: Process exited with an error: 1 (Exit value: 1) -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.codehaus.mojo:exec-maven-plugin:3.1.0:exec (generate-binary-license) on project distribution: Command execution failed.
@dudo since it is more or less Java, it should actually be compilable on multiple platforms. I will try to build it myself once; if there is more to look into, I will try to invest some time this week. Can you also tell me which version or tag you are trying to build?
@dudo I have built the latest version 28.0.1
https://github.com/arm64-compat/apache-druid/pkgs/container/apache%2Fdruid/177604533?tag=28.0.1
You can try it for your local / staging setup. Please refrain from using it in production, as I am just a maintainer of this repo / organization and can't actively test these images and their features myself. I hope this makes your life easier :)
I second this. Would be very handy if we could run an "out-of-box" Druid image on AWS Graviton. Makes a lot of sense both for small and large scale setups, especially if sharing a K8s cluster with other ARM workloads.
I have built the latest version, 30.0.0, locally, but when I use it the middleManager container crashes during ingestion.
Any specific issue you are seeing? Maybe some error logs will help.
No issue visible as my container is no longer visible when it crashes. Just exit code 137
OK, can you do a docker logs #container_id and provide some details there? It's hard to know what's crashing.
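For context on the exit code: by shell convention, an exit status above 128 means the process was terminated by signal (status - 128), so 137 corresponds to signal 9 (SIGKILL). With containers this is often, though not always, the kernel OOM killer acting on a memory limit; whether that is what happened here is an assumption, not something the thread confirms. A small sketch of the arithmetic, with a hypothetical helper name:

```shell
# Hypothetical helper: print the signal number implied by an exit code,
# or 0 if the code does not indicate termination by a signal.
signal_from_exit_code() {
  code="$1"
  if [ "$code" -gt 128 ]; then
    echo $((code - 128))
  else
    echo 0
  fi
}

signal_from_exit_code 137   # prints 9, i.e. SIGKILL

# If the stopped container still exists, docker records whether the
# OOM killer was involved (replace <container_id> with the real ID):
#   docker inspect --format '{{.State.OOMKilled}} {{.State.ExitCode}}' <container_id>
```

If `OOMKilled` reports true, raising the memory available to the container (or to the Docker Desktop VM on a Mac) would be the first thing to try.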
Here are the last lines of the logs:
2024-06-20 17:30:27 2024-06-20T15:30:27,303 DEBUG [qtp141697265-160] org.apache.druid.jetty.RequestLog - 192.168.192.11 GET //192.168.192.11:8091/druid/worker/v1/chat/query-e6f6ac5a-b2c1-41d8-bf4c-9798ef360108-worker0_0/counters HTTP/1.1 200
2024-06-20 17:30:27 2024-06-20T15:30:27,305 INFO [qtp141697265-148] org.apache.druid.msq.exec.WorkerImpl - Finish received for task [query-e6f6ac5a-b2c1-41d8-bf4c-9798ef360108-worker0_0]
2024-06-20 17:30:27 2024-06-20T15:30:27,305 DEBUG [qtp141697265-148] org.apache.druid.jetty.RequestLog - 192.168.192.11 POST //192.168.192.11:8091/druid/worker/v1/chat/query-e6f6ac5a-b2c1-41d8-bf4c-9798ef360108-worker0_0/finish HTTP/1.1 202
2024-06-20 17:30:27 2024-06-20T15:30:27,306 INFO [[query-e6f6ac5a-b2c1-41d8-bf4c-9798ef360108-worker0_0]-threading-task-runner-executor-1] org.apache.druid.msq.exec.LoadedSegmentDataProviderFactory - Waiting for any data server queries to be canceled.
2024-06-20 17:30:27 2024-06-20T15:30:27,306 WARN [controller-status-checker-0] org.apache.druid.msq.indexing.IndexerWorkerContext - Periodic fetch of controller location returned [ServiceLocations{locations=[], closed=true}]. Worker task [query-e6f6ac5a-b2c1-41d8-bf4c-9798ef360108-worker0_0] will exit.
2024-06-20 17:30:27 2024-06-20T15:30:27,306 INFO [controller-status-checker-0] org.apache.druid.msq.exec.WorkerImpl - Stopping gracefully for taskId [query-e6f6ac5a-b2c1-41d8-bf4c-9798ef360108-worker0_0]
2024-06-20 17:30:27 2024-06-20T15:30:27,308 INFO [threading-task-runner-executor-1] org.apache.druid.indexing.overlord.ThreadingTaskRunner - Removed task directory: var/druid/task/slot3/query-e6f6ac5a-b2c1-41d8-bf4c-9798ef360108-worker0_0
2024-06-20 17:30:27 2024-06-20T15:30:27,347 INFO [WorkerTaskManager-NoticeHandler] org.apache.druid.indexing.worker.WorkerTaskManager - Task [query-e6f6ac5a-b2c1-41d8-bf4c-9798ef360108-worker0_0] completed with status [SUCCESS].
2024-06-20 17:30:27 2024-06-20T15:30:27,471 DEBUG [qtp141697265-136] org.apache.druid.jetty.RequestLog - 127.0.0.1 GET //localhost:8091/status/health HTTP/1.1 200
2024-06-20 17:30:27 2024-06-20T15:30:27,514 INFO [[19e455bc-a674-4f9b-a550-436ff843709f_0_0:0]-batch-appenderator-persist] org.apache.druid.segment.realtime.appenderator.BatchAppenderator - Persisted in-memory data for segment[19b98542-5e68-4bfd-982c-8c45356fd76b_vertex_-146136543-09-08T08:23:32.096Z_146140482-04-24T15:36:27.903Z_2024-06-20T15:30:26.849Z] spill[0] to disk in [401] ms (23,321 rows).
2024-06-20 17:30:27 2024-06-20T15:30:27,522 INFO [[19e455bc-a674-4f9b-a550-436ff843709f_0_0:0]-batch-appenderator-persist] org.apache.druid.segment.realtime.appenderator.BatchAppenderator - Persisted in-memory data for segments: 19b98542-5e68-4bfd-982c-8c45356fd76b_vertex_-146136543-09-08T08:23:32.096Z_146140482-04-24T15:36:27.903Z_2024-06-20T15:30:26.849Z
2024-06-20 17:30:27 2024-06-20T15:30:27,522 INFO [[19e455bc-a674-4f9b-a550-436ff843709f_0_0:0]-batch-appenderator-persist] org.apache.druid.segment.realtime.appenderator.BatchAppenderator - Persisted stats: processed rows: [46494], persisted rows[23321], persisted sinks: [1], persisted fireHydrants (across sinks): [1]
2024-06-20 17:30:27 2024-06-20T15:30:27,522 INFO [[19e455bc-a674-4f9b-a550-436ff843709f_0_0:0]-batch-appenderator-persist] org.apache.druid.segment.realtime.appenderator.BatchAppenderator - Persisted rows[23,321] and bytes[30,990,856] and removed all sinks & hydrants from memory in[408] millis
2024-06-20 17:30:27 2024-06-20T15:30:27,522 INFO [[19e455bc-a674-4f9b-a550-436ff843709f_0_0:0]-batch-appenderator-persist] org.apache.druid.segment.realtime.appenderator.BatchAppenderator - Persist is done.
2024-06-20 17:30:27 2024-06-20T15:30:27,522 INFO [[19e455bc-a674-4f9b-a550-436ff843709f_0_0:0]-batch-appenderator-persist] org.apache.druid.segment.realtime.appenderator.BatchAppenderator - Spawning intermediate persist
2024-06-20 17:30:27 2024-06-20T15:30:27,565 INFO [[query-e6f6ac5a-b2c1-41d8-bf4c-9798ef360108]-threading-task-runner-executor-0] org.apache.druid.msq.exec.ControllerImpl - Controller will now wait for segments to be loaded. The query has already finished executing, and results will be included once the segments are loaded, even if this query is cancelled now.
2024-06-20 17:30:27 2024-06-20T15:30:27,567 INFO [query-e6f6ac5a-b2c1-41d8-bf4c-9798ef360108-segment-load-waiter-0] org.apache.druid.msq.exec.SegmentLoadStatusFetcher - Fetching segment load status for datasource[19b98542-5e68-4bfd-982c-8c45356fd76b] from broker
2024-06-20 17:30:27 2024-06-20T15:30:27,737 INFO [processing-0] org.apache.druid.segment.realtime.appenderator.BatchAppenderator - Incremental persist to disk because bytesCurrentlyInMemory[30995920] is greater than maxBytesInMemory[30994978].
2024-06-20 17:30:27 2024-06-20T15:30:27,897 DEBUG [qtp141697265-148] org.apache.druid.jetty.RequestLog - 192.168.192.8 GET //192.168.192.11:8091/druid/listen/v1/lookups HTTP/1.1 200
2024-06-20 17:30:27 2024-06-20T15:30:27,992 INFO [[19e455bc-a674-4f9b-a550-436ff843709f_0_0:0]-batch-appenderator-persist] org.apache.druid.segment.realtime.appenderator.BatchAppenderator - Persisted in-memory data for segment[19b98542-5e68-4bfd-982c-8c45356fd76b_vertex_-146136543-09-08T08:23:32.096Z_146140482-04-24T15:36:27.903Z_2024-06-20T15:30:26.849Z] spill[1] to disk in [468] ms (23,173 rows).
2024-06-20 17:30:28 2024-06-20T15:30:28,001 INFO [[19e455bc-a674-4f9b-a550-436ff843709f_0_0:0]-batch-appenderator-persist] org.apache.druid.segment.realtime.appenderator.BatchAppenderator - Persisted in-memory data for segments: 19b98542-5e68-4bfd-982c-8c45356fd76b_vertex_-146136543-09-08T08:23:32.096Z_146140482-04-24T15:36:27.903Z_2024-06-20T15:30:26.849Z
2024-06-20 17:30:28 2024-06-20T15:30:28,002 INFO [[19e455bc-a674-4f9b-a550-436ff843709f_0_0:0]-batch-appenderator-persist] org.apache.druid.segment.realtime.appenderator.BatchAppenderator - Persisted stats: processed rows: [68525], persisted rows[23173], persisted sinks: [1], persisted fireHydrants (across sinks): [1]
2024-06-20 17:30:28 2024-06-20T15:30:28,002 INFO [[19e455bc-a674-4f9b-a550-436ff843709f_0_0:0]-batch-appenderator-persist] org.apache.druid.segment.realtime.appenderator.BatchAppenderator - Persisted rows[23,173] and bytes[30,990,704] and removed all sinks & hydrants from memory in[478] millis
2024-06-20 17:30:28 2024-06-20T15:30:28,002 INFO [[19e455bc-a674-4f9b-a550-436ff843709f_0_0:0]-batch-appenderator-persist] org.apache.druid.segment.realtime.appenderator.BatchAppenderator - Persist is done.
2024-06-20 17:30:28 2024-06-20T15:30:28,002 INFO [[19e455bc-a674-4f9b-a550-436ff843709f_0_0:0]-batch-appenderator-persist] org.apache.druid.segment.realtime.appenderator.BatchAppenderator - Spawning intermediate persist
2024-06-20 17:30:28 2024-06-20T15:30:28,232 INFO [processing-4] org.apache.druid.segment.realtime.appenderator.BatchAppenderator - Incremental persist to disk because bytesCurrentlyInMemory[30995490] is greater than maxBytesInMemory[30994978].
I don't see any errors in the logs. What's the underlying machine you are testing this on? I assume this is native batch ingestion? (index_parallel task)? I can try to reproduce it.
MSQ ingestion using a parquet file. I launch 3 MSQ ingestions at the same time.
My machine is an Apple M2 Max
Dope, will try a repro with docker.
An alternative is to build using buildx for the amd64 platform:
docker buildx build --platform linux/amd64 .
I have to deactivate 'Use Rosetta for x86_64/amd64 emulation on Apple Silicon'.
Afterwards, reactivate 'Use Rosetta' and, in your docker-compose, add
platform: linux/amd64
cpuset: '0'
for all druid containers. It works, but it is quite slow.
I think I got an arm64 build partially working locally in docker, but running into some service discovery issues. Will continue to chip away at this.
I'm also trying to figure out how docker images are published to docker hub from this project. Will report back as I make progress.
Hey @2bethere did you manage to make any progress here?
Description
apache/druid repository on Dockerhub stores amd64 images only. I'd like to suggest generating official arm64 images.
Motivation
arm64 workloads are becoming more and more popular. If possible, Docker images should also be available for this architecture.