polyzos / advent-of-code-flink-paimon


failed to create file:/opt/flink/temp/paimon/default.db/measurements/bucket-0 #1

Open Ebaharloo opened 5 months ago

Ebaharloo commented 5 months ago

Hello, I used the YAML from your Flink book, added the Paimon JAR to its jars folder and to the Dockerfile, and created the compose file below (also attached). Then, following this repository's README.md and guide.md, I created the "measurements" table and ran the other commands. Everything worked until I submitted this job: SET 'pipeline.name' = 'Sensor Info Ingestion';

INSERT INTO sensor_info
SELECT * FROM `default_catalog`.`default_database`.sensor_info;

Under Running Jobs in the Flink UI, I received this root exception:

2024-04-23 18:14:17
java.io.IOException: java.io.UncheckedIOException: java.io.IOException: Mkdirs failed to create file:/opt/flink/temp/paimon/default.db/measurements/bucket-0
    at org.apache.paimon.flink.sink.RowDataStoreWriteOperator.processElement(RowDataStoreWriteOperator.java:125)
    at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:237)
    at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.processElement(AbstractStreamTaskNetworkInput.java:146)
    at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:110)
    at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:550)
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:231)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:839)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:788)
    at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:952)
    at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:931)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:745)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562)
    at java.base/java.lang.Thread.run(Unknown Source)
Caused by: java.io.UncheckedIOException: java.io.IOException: Mkdirs failed to create file:/opt/flink/temp/paimon/default.db/measurements/bucket-0
    at org.apache.paimon.io.SingleFileWriter.<init>(SingleFileWriter.java:77)
    at org.apache.paimon.io.StatsCollectingSingleFileWriter.<init>(StatsCollectingSingleFileWriter.java:58)
    at org.apache.paimon.io.RowDataFileWriter.<init>(RowDataFileWriter.java:58)
    at org.apache.paimon.io.RowDataRollingFileWriter.lambda$new$0(RowDataRollingFileWriter.java:51)
    at org.apache.paimon.io.RollingFileWriter.openCurrentWriter(RollingFileWriter.java:100)
    at org.apache.paimon.io.RollingFileWriter.write(RollingFileWriter.java:79)
    at org.apache.paimon.append.AppendOnlyWriter.write(AppendOnlyWriter.java:113)
    at org.apache.paimon.append.AppendOnlyWriter.write(AppendOnlyWriter.java:51)
    at org.apache.paimon.operation.AbstractFileStoreWrite.write(AbstractFileStoreWrite.java:114)
    at org.apache.paimon.table.sink.TableWriteImpl.writeAndReturn(TableWriteImpl.java:114)
    at org.apache.paimon.flink.sink.StoreSinkWriteImpl.write(StoreSinkWriteImpl.java:159)
    at org.apache.paimon.flink.sink.RowDataStoreWriteOperator.processElement(RowDataStoreWriteOperator.java:123)
    ... 13 more
Caused by: java.io.IOException: Mkdirs failed to create file:/opt/flink/temp/paimon/default.db/measurements/bucket-0
    at org.apache.paimon.fs.local.LocalFileIO.newOutputStream(LocalFileIO.java:80)
    at org.apache.paimon.io.SingleFileWriter.<init>(SingleFileWriter.java:68)
    ... 24 more 

I think this is because Paimon cannot create the files or folders it needs at that location. I also checked the path: /opt/flink/temp/paimon/default.db/measurements/ exists, and it contains a file named schema-0, but no bucket-0 and no Parquet files. The schema-0 file contains the table metadata for the job. What did I do wrong?

{
  "id" : 0,
  "fields" : [ {
    "id" : 0,
    "name" : "sensor_id",
    "type" : "BIGINT"
  }, {
    "id" : 1,
    "name" : "reading",
    "type" : "DECIMAL(5, 1)"
  } ],
  "highestFieldId" : 1,
  "partitionKeys" : [ ],
  "primaryKeys" : [ ],
  "options" : {
    "bucket" : "1",
    "schema.2.expr" : "PROCTIME()",
    "schema.2.data-type" : "TIMESTAMP(3) WITH LOCAL TIME ZONE NOT NULL",
    "schema.2.name" : "event_time",
    "bucket-key" : "sensor_id",
    "file.format" : "parquet"
  },
  "timeMillis" : 1713882941002
}
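The schema-0 file above shows that the table metadata was committed successfully; only the data write into bucket-0 fails. As a small illustration, a hypothetical helper (the function name and summary fields are my own, not part of Paimon) can load such a schema file and pull out the layout details relevant here:

```python
import json

def summarize_paimon_schema(path: str) -> dict:
    """Load a Paimon schema-N metadata file and extract the details
    relevant to this issue: field names, bucket count, and file format."""
    with open(path) as f:
        schema = json.load(f)
    return {
        "fields": [field["name"] for field in schema["fields"]],
        "bucket": schema["options"].get("bucket"),
        "format": schema["options"].get("file.format"),
    }
```

For this table it reports two fields (sensor_id, reading), a single bucket, and Parquet as the file format, which matches the bucket-0 directory the writer is trying to create.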
version: "3.7"
networks:
  redpanda_network:
    driver: bridge
volumes:
  redpanda: null
services:
  redpanda:
    command:
      - redpanda
      - start
      - --kafka-addr internal://0.0.0.0:9092,external://0.0.0.0:19092
      - --advertise-kafka-addr internal://redpanda:9092,external://localhost:19092
      - --pandaproxy-addr internal://0.0.0.0:8082,external://0.0.0.0:18082
      - --advertise-pandaproxy-addr internal://redpanda:8082,external://localhost:18082
      - --schema-registry-addr internal://0.0.0.0:8081,external://0.0.0.0:18081
      - --rpc-addr redpanda:33145
      - --advertise-rpc-addr redpanda:33145
      - --smp 1
      - --memory 1G
      - --mode dev-container
      - --default-log-level=debug
    image: docker.redpanda.com/redpandadata/redpanda:v23.2.3
    container_name: redpanda
    volumes:
      - redpanda:/var/lib/redpanda/data
    networks:
      - redpanda_network
    ports:
      - "18081:18081"
      - "18082:18082"
      - "19092:19092"
      - "19644:9644"
  console:
    container_name: redpanda-console
    image: docker.redpanda.com/vectorized/console:v2.3.0
    networks:
      - redpanda_network
    entrypoint: /bin/sh
    command: -c 'echo "$$CONSOLE_CONFIG_FILE" > /tmp/config.yml; /app/console'
    environment:
      CONFIG_FILEPATH: /tmp/config.yml
      CONSOLE_CONFIG_FILE: |
        kafka:
          brokers: ["redpanda:9092"]
          schemaRegistry:
            enabled: true
            urls: ["http://redpanda:8081"]
        redpanda:
          adminApi:
            enabled: true
            urls: ["http://redpanda:9644"]
    ports:
      - "8080:8080"
    depends_on:
      - redpanda
  jobmanager:
    build: .
    container_name: jobmanager
    ports:
      - "8081:8081"
      - "9249:9249"
    command: jobmanager
    networks:
      - redpanda_network
    volumes:
      - ./jars/:/opt/flink/jars
      - ./logs/flink/:/opt/flink/temp
    environment:
      - |
        FLINK_PROPERTIES=
        jobmanager.rpc.address: jobmanager
        jobmanager.scheduler: Adaptive
        metrics.reporters: prom
        metrics.reporter.prom.factory.class: org.apache.flink.metrics.prometheus.PrometheusReporterFactory
        metrics.reporter.prom.port: 9249
        state.backend.rocksdb.metrics.actual-delayed-write-rate: true
        state.backend.rocksdb.metrics.background-errors: true
        state.backend.rocksdb.metrics.block-cache-capacity: true
        state.backend.rocksdb.metrics.estimate-num-keys: true
        state.backend.rocksdb.metrics.estimate-live-data-size: true
        state.backend.rocksdb.metrics.estimate-pending-compaction-bytes: true
        state.backend.rocksdb.metrics.num-running-compactions: true
        state.backend.rocksdb.metrics.compaction-pending: true
        state.backend.rocksdb.metrics.is-write-stopped: true
        state.backend.rocksdb.metrics.num-running-flushes: true
        state.backend.rocksdb.metrics.mem-table-flush-pending: true
        state.backend.rocksdb.metrics.block-cache-usage: true
        state.backend.rocksdb.metrics.size-all-mem-tables: true
        state.backend.rocksdb.metrics.num-live-versions: true
        state.backend.rocksdb.metrics.block-cache-pinned-usage: true
        state.backend.rocksdb.metrics.estimate-table-readers-mem: true
        state.backend.rocksdb.metrics.num-snapshots: true
        state.backend.rocksdb.metrics.num-entries-active-mem-table: true
        state.backend.rocksdb.metrics.num-deletes-imm-mem-tables: true
  taskmanager1:
    build: .
    container_name: taskmanager1
    depends_on:
      - jobmanager
    command: taskmanager
    networks:
      - redpanda_network
    ports:
      - "9250:9249"
    volumes:
      - ./logs/flink/:/opt/flink/temp
    environment:
      - |
        FLINK_PROPERTIES=
        jobmanager.rpc.address: jobmanager
        taskmanager.numberOfTaskSlots: 5
        metrics.reporters: prom
        metrics.reporter.prom.factory.class: org.apache.flink.metrics.prometheus.PrometheusReporterFactory
        metrics.reporter.prom.port: 9249
        state.backend.rocksdb.metrics.actual-delayed-write-rate: true
        state.backend.rocksdb.metrics.background-errors: true
        state.backend.rocksdb.metrics.block-cache-capacity: true
        state.backend.rocksdb.metrics.estimate-num-keys: true
        state.backend.rocksdb.metrics.estimate-live-data-size: true
        state.backend.rocksdb.metrics.estimate-pending-compaction-bytes: true
        state.backend.rocksdb.metrics.num-running-compactions: true
        state.backend.rocksdb.metrics.compaction-pending: true
        state.backend.rocksdb.metrics.is-write-stopped: true
        state.backend.rocksdb.metrics.num-running-flushes: true
        state.backend.rocksdb.metrics.mem-table-flush-pending: true
        state.backend.rocksdb.metrics.block-cache-usage: true
        state.backend.rocksdb.metrics.size-all-mem-tables: true
        state.backend.rocksdb.metrics.num-live-versions: true
        state.backend.rocksdb.metrics.block-cache-pinned-usage: true
        state.backend.rocksdb.metrics.estimate-table-readers-mem: true
        state.backend.rocksdb.metrics.num-snapshots: true
        state.backend.rocksdb.metrics.num-entries-active-mem-table: true
        state.backend.rocksdb.metrics.num-deletes-imm-mem-tables: true
  taskmanager2:
    build: .
    container_name: taskmanager2
    depends_on:
      - jobmanager
    command: taskmanager
    networks:
      - redpanda_network
    ports:
      - "9251:9249"
    volumes:
      - ./logs/flink/:/opt/flink/temp
    environment:
      - |
        FLINK_PROPERTIES=
        jobmanager.rpc.address: jobmanager
        jobmanager.scheduler: Adaptive
        taskmanager.numberOfTaskSlots: 5
        metrics.reporters: prom
        metrics.reporter.prom.factory.class: org.apache.flink.metrics.prometheus.PrometheusReporterFactory
        metrics.reporter.prom.port: 9249
        state.backend.rocksdb.metrics.actual-delayed-write-rate: true
        state.backend.rocksdb.metrics.background-errors: true
        state.backend.rocksdb.metrics.block-cache-capacity: true
        state.backend.rocksdb.metrics.estimate-num-keys: true
        state.backend.rocksdb.metrics.estimate-live-data-size: true
        state.backend.rocksdb.metrics.estimate-pending-compaction-bytes: true
        state.backend.rocksdb.metrics.num-running-compactions: true
        state.backend.rocksdb.metrics.compaction-pending: true
        state.backend.rocksdb.metrics.is-write-stopped: true
        state.backend.rocksdb.metrics.num-running-flushes: true
        state.backend.rocksdb.metrics.mem-table-flush-pending: true
        state.backend.rocksdb.metrics.block-cache-usage: true
        state.backend.rocksdb.metrics.size-all-mem-tables: true
        state.backend.rocksdb.metrics.num-live-versions: true
        state.backend.rocksdb.metrics.block-cache-pinned-usage: true
        state.backend.rocksdb.metrics.estimate-table-readers-mem: true
        state.backend.rocksdb.metrics.num-snapshots: true
        state.backend.rocksdb.metrics.num-entries-active-mem-table: true
        state.backend.rocksdb.metrics.num-deletes-imm-mem-tables: true
  postgres:
    image: postgres:latest
    networks:
      - redpanda_network
    container_name: postgres
    restart: always
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
    ports:
      - '5432:5432'
  prometheus:
    image: prom/prometheus:latest
    networks:
      - redpanda_network
    container_name: prometheus
    command:
      - '--config.file=/etc/prometheus/config.yaml'
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus:/etc/prometheus
  #      - ./prom_data:/prometheus
  grafana:
    image: grafana/grafana
    networks:
      - redpanda_network
    container_name: grafana
    ports:
      - "3000:3000"
    restart: unless-stopped
    environment:
      - GF_SECURITY_ADMIN_USER=grafana
      - GF_SECURITY_ADMIN_PASSWORD=grafana
    volumes:
      - ./grafana/provisioning:/etc/grafana/provisioning
      - ./grafana/dashboards:/var/lib/grafana/dashboards
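One thing worth checking in this compose file: /opt/flink/temp inside the jobmanager and taskmanager containers is a bind mount of ./logs/flink on the host, and the official Flink images run as a non-root flink user. On WSL the host directory can end up owned by root or carry restrictive permissions, in which case File.mkdirs() inside the container returns false. A minimal sketch of pre-creating the host directory with permissive modes before docker compose up (the helper name is my own, and 0o777 is heavy-handed but acceptable for a throwaway local demo):

```python
import os

def prepare_bind_mount(host_dir: str) -> None:
    """Pre-create the host side of a bind mount (here ./logs/flink,
    mapped to /opt/flink/temp in the containers) so the container's
    non-root 'flink' user can write into it."""
    os.makedirs(host_dir, exist_ok=True)
    # chmod the leaf directory so any container UID can create
    # subdirectories (paimon/default.db/.../bucket-0) beneath it.
    os.chmod(host_dir, 0o777)
```

The shell equivalent would be mkdir -p logs/flink followed by chmod 777 logs/flink, run from the repository root before starting the stack.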
polyzos commented 5 months ago

@Ebaharloo can you try with the docker-compose.yaml file in this repo?

Also, are you running on windows?

Ebaharloo commented 5 months ago

I'm running on WSL. I can try that too.

Ebaharloo commented 5 months ago

The same issue occurred in the docker-compose.yaml file within this repository, even on a different machine running WSL. What could be the cause?

2024-04-23 23:36:43
java.io.IOException: java.io.UncheckedIOException: java.io.IOException: Mkdirs failed to create file:/opt/flink/temp/paimon/default.db/measurements/bucket-0
    at org.apache.paimon.flink.sink.RowDataStoreWriteOperator.processElement(RowDataStoreWriteOperator.java:125)
    at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:237)
    at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.processElement(AbstractStreamTaskNetworkInput.java:146)
    at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:110)
    at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:550)
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:231)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:839)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:788)
    at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:952)
    at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:931)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:745)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562)
    at java.base/java.lang.Thread.run(Unknown Source)
Caused by: java.io.UncheckedIOException: java.io.IOException: Mkdirs failed to create file:/opt/flink/temp/paimon/default.db/measurements/bucket-0
    at org.apache.paimon.io.SingleFileWriter.<init>(SingleFileWriter.java:77)
    at org.apache.paimon.io.StatsCollectingSingleFileWriter.<init>(StatsCollectingSingleFileWriter.java:58)
    at org.apache.paimon.io.RowDataFileWriter.<init>(RowDataFileWriter.java:58)
    at org.apache.paimon.io.RowDataRollingFileWriter.lambda$new$0(RowDataRollingFileWriter.java:51)
    at org.apache.paimon.io.RollingFileWriter.openCurrentWriter(RollingFileWriter.java:100)
    at org.apache.paimon.io.RollingFileWriter.write(RollingFileWriter.java:79)
    at org.apache.paimon.append.AppendOnlyWriter.write(AppendOnlyWriter.java:113)
    at org.apache.paimon.append.AppendOnlyWriter.write(AppendOnlyWriter.java:51)
    at org.apache.paimon.operation.AbstractFileStoreWrite.write(AbstractFileStoreWrite.java:114)
    at org.apache.paimon.table.sink.TableWriteImpl.writeAndReturn(TableWriteImpl.java:114)
    at org.apache.paimon.flink.sink.StoreSinkWriteImpl.write(StoreSinkWriteImpl.java:159)
    at org.apache.paimon.flink.sink.RowDataStoreWriteOperator.processElement(RowDataStoreWriteOperator.java:123)
    ... 13 more
Caused by: java.io.IOException: Mkdirs failed to create file:/opt/flink/temp/paimon/default.db/measurements/bucket-0
    at org.apache.paimon.fs.local.LocalFileIO.newOutputStream(LocalFileIO.java:80)
    at org.apache.paimon.io.SingleFileWriter.<init>(SingleFileWriter.java:68)
    ... 24 more
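For what it's worth, the failing call is Paimon's LocalFileIO.newOutputStream, which throws exactly this IOException when Java's File.mkdirs() returns false — typically a permissions or ownership problem on the mounted path rather than a missing dependency. A rough probe (check_writable is my own sketch, not part of Paimon) that mimics what the writer needs to do can be run against the mounted directory to confirm:

```python
import os

def check_writable(table_dir: str) -> bool:
    """Mimic what Paimon's writer needs: create a bucket-0 directory
    under the table path and open a file inside it. Returns False in
    the same situations where Java's File.mkdirs() fails (no permission,
    parent is not a directory, read-only mount, ...)."""
    bucket = os.path.join(table_dir, "bucket-0")
    try:
        os.makedirs(bucket, exist_ok=True)
        probe = os.path.join(bucket, ".write-probe")
        with open(probe, "w") as f:
            f.write("ok")
        os.remove(probe)
        return True
    except OSError:
        return False
```

Running this inside the taskmanager container (for example via docker exec) against /opt/flink/temp/paimon/default.db/measurements should return False if the bind mount is the culprit.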