Closed enahwe closed 7 years ago
Hi enahwe, thanks very much for your solution and your interest in the project. In the experimental branch (develop-mra), we implemented a solution similar to the one you suggest for the same problem: the name of the file being processed by Flume is not available. That solution was not merged into the main public project because no tests were written for the new feature, and also for lack of time. I am keeping the issue open because, as you say, the file name provides very useful information. I am glad to read that your solution is working correctly. Best regards, Luis.
flume-ftp-source v2.0.10 includes the file's name and path in the event header. Best, Luis
Hi,
The text below describes a solution to a problem rather than an open problem.
I needed to store FTP files in HDFS while keeping the source file name for every FTP file read, and I unfortunately found that the value of the %{basename} property was empty!
What a pity! The name of the source file can sometimes contain very useful information that you can extract and inject into the target file (into its path or its content), such as transaction dates, business codes, and more.
So I first developed a Flume Interceptor, thinking that would be the nicer approach, but nothing changed! The %{basename} property was still empty!
Finally, I adapted your Java class named [Source] so that it propagates the source file name property (%{basename}) for every FTP file read. So now it works well!
For information: I added the parameter 'elementName' to the methods 'readStream(..)' and 'processMessages(..)'.
I added the following line to the 'processMessages(..)' method: headers.put("basename", elementName);
Finally, I passed the 'elementName' parameter through by modifying the following code in the 'discoverElements(..)' method: if (!readStream(inputStream, position, elementName)) { inputStream = null; }
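To make the idea easier to follow, here is a minimal, self-contained sketch of how the file name travels from the read loop into each event's headers. It is not the actual [Source] class: the class name BasenameHeaderSketch, the SimpleEvent type, and the simplified signatures of readStream(..) and processMessage(..) are assumptions made for illustration only, and it uses plain Java instead of the Flume API.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the real [Source] class; names and signatures are simplified.
public class BasenameHeaderSketch {

    // Hypothetical minimal event: a header map plus a body, loosely mirroring a Flume event.
    static final class SimpleEvent {
        final Map<String, String> headers;
        final byte[] body;

        SimpleEvent(Map<String, String> headers, byte[] body) {
            this.headers = headers;
            this.body = body;
        }
    }

    // Builds one event and stores the source file name under the "basename" header key.
    static SimpleEvent processMessage(byte[] body, String elementName) {
        Map<String, String> headers = new HashMap<>();
        headers.put("basename", elementName);
        return new SimpleEvent(headers, body);
    }

    // Reads the stream line by line and passes the file name along to every event created.
    static List<SimpleEvent> readStream(InputStream in, String elementName) throws IOException {
        List<SimpleEvent> events = new ArrayList<>();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                events.add(processMessage(line.getBytes(StandardCharsets.UTF_8), elementName));
            }
        }
        return events;
    }

    public static void main(String[] args) throws IOException {
        // The file name and content below are made-up sample values.
        InputStream in = new ByteArrayInputStream(
            "first line\nsecond line\n".getBytes(StandardCharsets.UTF_8));
        for (SimpleEvent event : readStream(in, "tx_20170101.csv")) {
            System.out.println(event.headers.get("basename") + " -> "
                + new String(event.body, StandardCharsets.UTF_8));
        }
    }
}
```

Because the header is set on every event, a sink that expands %{basename} should then see the file name for each event coming from that file.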
Please see below the content of the new [Source] class, which can replace the current class if someone needs to use the %{basename} property (the source file name for every FTP source file read):