jcustenborder / kafka-connect-spooldir

Kafka Connect connector for reading CSV files into Kafka.
Apache License 2.0

Fails to delete subdirectory - but there is none? #217

Open talljhawkins opened 1 week ago

talljhawkins commented 1 week ago

I'm confused here. Can you help me work out whether this is a bug or just a misconfiguration on my side? I have the following:

"input.file.pattern": "^.*\.fuu$",
"input.path": "c:\B2BiInput\LWFW\input",
"finished.path": "c:\B2BiInput\LWFW\finished",
"error.path": "c:\B2BiInput\LWFW\errors",

The processing works just fine but it always logs an error for every file it processes:

Failed to delete input.path sub-directory: c:\B2BiInput\LWFW\input\<I've redacted the filename.fuu> (com.github.jcustenborder.kafka.connect.spooldir.InputFile:242)

Is it trying to delete a subdirectory, such as c:\B2BiInput\LWFW or perhaps c:\B2BiInput\LWFW\input, and if so, why? The file itself gets deleted/moved OK, so that's not a problem.

I've tried this on a few systems and always get the same error.
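A side note on the Windows paths in the config above: if the connector config is submitted as JSON (for example through the Connect REST API), every backslash has to be doubled, because `\B` or `\L` is not a valid JSON escape. The thread does not say how the config was submitted, and the backslashes may simply have been lost when the comment was rendered, so the following is only a minimal sketch of how the values would need to look in JSON. The Python dictionary is a hypothetical stand-in for the real config.

```python
import json

# Values copied from the comment above, written as raw strings so that
# Python keeps every backslash exactly as typed.
config = {
    "input.file.pattern": r"^.*\.fuu$",
    "input.path": r"c:\B2BiInput\LWFW\input",
    "finished.path": r"c:\B2BiInput\LWFW\finished",
    "error.path": r"c:\B2BiInput\LWFW\errors",
}

# json.dumps escapes each backslash, producing text that is valid JSON.
encoded = json.dumps(config, indent=2)
print(encoded)

# Decoding the JSON text gives back the original single-backslash values.
assert json.loads(encoded) == config
```

The printed JSON contains `c:\\B2BiInput\\LWFW\\input` and `^.*\\.fuu$`, which decode back to the single-backslash values shown in the comment.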

jcustenborder commented 1 week ago

Hmmm. Offhand, I have not seen anything like this. Can you enable trace logging?

abrahamgreyson commented 5 days ago

Same here. The file is deleted after the messages are committed to the topic, but the log shows the same error:

[2024-09-13 18:32:58,655] INFO WorkerSourceTask{id=data-0} Committing offsets for 317995 acknowledged messages (org.apache.kafka.connect.runtime.WorkerSourceTask)
[2024-09-13 18:33:02,788] ERROR Failed to delete input.path sub-directory: /usr/share/upstream/data/data-20240911-1.csv (com.github.jcustenborder.kafka.connect.spooldir.InputFile)

Here is the conf:

{
  "tasks.max": "1",
  "connector.class": "com.github.jcustenborder.kafka.connect.spooldir.SpoolDirCsvSourceConnector",
  "input.path": "/usr/share/upstream/data",
  "error.path": "/usr/share/upstream/error",
  "finished.path": "/usr/share/upstream/finished",
  "input.file.pattern": "data.*\\.csv",
  "halt.on.error": "true",
  "topic": "data",
  "csv.first.row.as.header": "true",
  "schema.generation.enabled": "true",
  "cleanup.policy": "DELETE",
  "empty.poll.wait.ms": "180000",
  "csv.rfc.4180.parser.enabled": "true"
}

It's difficult for me to change the log level as I'm new to Java, but the Connect container is based on Confluent/Connect.
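For anyone in the same position, trace logging can usually be switched on at runtime without touching any Java or log4j files: Kafka Connect (2.4 and later) exposes a `PUT /admin/loggers/<logger>` endpoint on its REST API. The sketch below is a rough example under two assumptions that are not stated in this thread: the worker's REST API is reachable at `http://localhost:8083`, and the helper names `build_log_level_request` and `set_log_level` are made up for illustration. The logger name is the package prefix that appears in the log lines above.

```python
import json
import urllib.request

# Assumption: the Connect worker's REST API is reachable here (8083 is the default port).
CONNECT_URL = "http://localhost:8083"
# Package prefix taken from the log lines quoted in this thread.
LOGGER = "com.github.jcustenborder.kafka.connect.spooldir"


def build_log_level_request(base_url, logger, level="TRACE"):
    """Build a PUT /admin/loggers/<logger> request that sets the given level."""
    body = json.dumps({"level": level}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/admin/loggers/{logger}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def set_log_level(base_url=CONNECT_URL, logger=LOGGER, level="TRACE"):
    """Send the request; return the decoded response body, or None if it is empty."""
    request = build_log_level_request(base_url, logger, level)
    with urllib.request.urlopen(request, timeout=10) as response:
        raw = response.read()
    return json.loads(raw.decode("utf-8")) if raw else None


if __name__ == "__main__":
    # Needs a running Connect worker; prints whatever the worker sends back.
    print(set_log_level())
```

A level set this way is not persisted and is lost when the worker restarts. With the Confluent Connect Docker image, the `CONNECT_LOG4J_LOGGERS` environment variable (for example `com.github.jcustenborder.kafka.connect.spooldir=TRACE`) should give the same result from container start, though that depends on the image version in use.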