Closed jrnt30 closed 1 year ago
Noticed that some warnings are now being thrown for recursive delete on the base directories as well. I'll take a look and get it fully working.
Please make this an opt-in feature, since it will be a breaking change for some people like us 🙏🏻
@yordis Appreciate the feedback and the nudge for a configuration parameter. I updated the description of the issue to make the current and adjusted functionality clear. The way it currently works was a bit surprising to us.
I don't believe the changes proposed here should break an existing setup unless you have a folder structure on the server that contains nested directories, in which case I think you would already be seeing the same issue that we are. Please let me know if I am missing something, though!
(Also, as a side note: if it is preferable to be more explicit and have `ProcessFiles` operate directly off of the paths of the agent, we can go that route too.)
I decided to just go ahead and adjust to using the explicit folder structure. Let me know if this feels safer/better.
Base: 40.81% // Head: 40.81% // Increases project coverage by +0.00% :tada:
Coverage data is based on head (319a37d) compared to base (b8e8663). Patch coverage: 62.50% of modified lines in pull request are covered.
Has this been working for you?
@jrnt30 based on my career experience, the fewer assumptions, the better. These configurations are typically set up once and rarely, if ever, change.
If something does change, you probably want things to break when the change wasn't expected.
So, your solution is something I would do. We use Terraform and Helm to provision things, so there is a minimal cost to it. We do not have any dynamic directory creation ever.
It has been working for us, but wanted to call out two things.
It was unclear to me at the time that the "absence of a path" meant that the SFTP user's working directory would be used for the `xPath()`s. I haven't gone back and explicitly set these to fully qualified (but non-existent) paths to see if that would have resolved this.
Additionally, using relative directories we found another issue that I haven't gone back to resolve yet. The Agent functionality differs between the SFTP and FTP clients when using relative paths for the `xPath`s.
The FTP client `cd`s into the different paths and pulls the files from that directory locally. This functions properly with our setup of having a non-root working directory for the FTP user's default path as well as a relative path to the folders we want to process.
The SFTP agent's code lists the files for each of the directories using the user's working directory joined with the relative paths. It then uses the relative path to attempt to Stat the file and download it. However, when doing so, the path that was constructed from the file listing does not pass the `.Stat` command.
Has this been working for folks? My bad it hasn't been merged. I'm reading the comments again to understand what the tradeoffs are.
Edit: Can we rebase this PR off of the master branch? There have been some quality of life improvements and fixes.
The SFTP agent's code lists the files for each of the directories using the user's working directory joined with the relative paths. It then uses the relative path to attempt to Stat the file and download it. However, when doing so, the path that was constructed from the file listing does not pass the `.Stat` command.
Sounds like we need to resolve the path within the SFTP server before trying to stat?
Thanks for updating this PR. I'm comfortable merging if you think it's ready.
Hey Adam! Sorry for the delayed response, have been OOO for a bit. This has been working for us with the caveat that we adjusted to use absolute paths for the directories.
- Server with a base root FTP folder of: `/`
- User with a different default working directory: `/user/jrnt30/`
- Any `xPath` that is relative and not absolute: `./files`
If the project would like to support paths relative to the user's working directory as well as absolute paths, there is additional work that would need to be done to get SFTP working to match the behavior of the FTP processor. I looked at it briefly but didn't see any similar `cd`-like capability in the Golang SFTP client and stopped pursuing it.
Right. The SFTP client doesn't support `cd`-like behavior. I'm thinking that we want both clients to support:
- Absolute path provided. (e.g. `/data/returned/`)
- Relative path from user's home (e.g. `./returned/`)

In your example I'd expect the resolved path to be `/user/jrnt30/files`, right?
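For illustration, the two path types discussed above could be resolved along these lines. This is only a sketch under stated assumptions: `resolveRemotePath` is a hypothetical helper, not code from this PR, and it assumes the user's working directory has already been obtained from the server by some other means.

```go
package main

import (
	"fmt"
	"path"
)

// resolveRemotePath is a hypothetical helper (not part of this PR).
// It returns an absolute remote path for p:
//   - an absolute p is only cleaned and otherwise kept as-is
//   - a relative p is joined onto workingDir
//
// The "path" package is used instead of "path/filepath" because
// remote FTP/SFTP paths use forward slashes regardless of local OS.
func resolveRemotePath(workingDir, p string) string {
	if path.IsAbs(p) {
		return path.Clean(p)
	}
	return path.Join(workingDir, p)
}

func main() {
	// Relative path from the user's working directory.
	fmt.Println(resolveRemotePath("/user/jrnt30/", "./files")) // /user/jrnt30/files

	// Absolute path provided.
	fmt.Println(resolveRemotePath("/user/jrnt30/", "/data/returned/")) // /data/returned
}
```

Resolving once up front would let the listing, Stat, and download steps all use the same absolute path, which is the mismatch described earlier in the thread.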
In your example I'd expect the resolved path to be `/user/jrnt30/files`, right?
Yep!
Right. The SFTP client doesn't support `cd`-like behavior. I'm thinking that we want both clients to support:
- Absolute path provided. (e.g. `/data/returned/`)
- Relative path from user's home (e.g. `./returned/`)
Is support for both path types something you would like to see in the PR prior to merging? One of the things I have not explored is what visibility we have into a user's default working directory, which we might need in order to construct a fully qualified path.
IIRC there were some differences in how SFTP and FTP worked for these different relative vs. absolute path types as well. I had started trying to replicate our SFTP setup a bit more directly in the Docker Compose setup included but haven't looked at it in quite a while. If this is something you'd like to tackle now I can try and get a few hours over the next week or two to contribute an example.
FYI, I just got bit by this and our banking partner isn't going to change their directory structure to be unnested. Any idea if this is going in any time soon?
Yea, I can release this today. It seems safe enough to deploy.
Changes
Adds in the ability to have a nested directory as the source of incoming files.
Why Are Changes Being Made
Currently when processing an `inbound` folder that is a nested directory (ex: `"inbound": "/client_directory/with/subpath"`), the processing of the files fails due to `processFile` being called with a directory instead of a specific file. This is not a fair assumption if the `InboundPath()`, `ReconciliationPath()`, etc. have any "depth" to them.

In other places we make use explicitly of the agent's path information to determine what folders/files to operate on. Following a similar pattern here.
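As a rough sketch of the pattern described above (not the code in this PR), the idea is to read each explicitly configured directory and hand only non-directory entries on for processing. The helper names `filesIn` and `demoFS` and the in-memory file tree are assumptions made for illustration, and the paths are unrooted because Go's `io/fs` does not allow a leading slash.

```go
package main

import (
	"fmt"
	"io/fs"
	"path"
	"testing/fstest"
)

// filesIn is a hypothetical helper. For each explicitly named
// directory it returns the files directly inside it and skips any
// entry that is itself a directory, so a per-file processing step
// is never handed a directory.
func filesIn(fsys fs.FS, dirs ...string) ([]string, error) {
	var out []string
	for _, dir := range dirs {
		entries, err := fs.ReadDir(fsys, dir)
		if err != nil {
			return nil, fmt.Errorf("reading %s: %w", dir, err)
		}
		for _, e := range entries {
			if e.IsDir() {
				continue // nested directory, not a file to process
			}
			out = append(out, path.Join(dir, e.Name()))
		}
	}
	return out, nil
}

// demoFS builds an in-memory tree standing in for the remote server.
func demoFS() fstest.MapFS {
	return fstest.MapFS{
		"client_directory/with/subpath/a.ach":         {Data: []byte("a")},
		"client_directory/with/subpath/b.ach":         {Data: []byte("b")},
		"client_directory/with/subpath/archive/c.ach": {Data: []byte("c")},
	}
}

func main() {
	files, err := filesIn(demoFS(), "client_directory/with/subpath")
	if err != nil {
		panic(err)
	}
	for _, f := range files {
		fmt.Println(f)
	}
}
```

Listing `client_directory` directly instead would yield only the `with` directory entry, which is the situation that previously reached `processFile`.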
NOTE: The way that the audit folder path is constructed in `processFile` currently does not show the full/nested path but rather just the "last" directory in this hierarchy. If the full path would be preferred (debatable), I can also work on making some of those changes.
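To make the difference in the note concrete, here is a small sketch of the two naming options. Both helpers and the `audit` root are hypothetical and are not taken from `processFile`; they only contrast keeping the last path element with keeping the whole nested path.

```go
package main

import (
	"fmt"
	"path"
)

// auditDirLastOnly is a hypothetical helper that keeps only the
// final element of the source directory under the audit root.
func auditDirLastOnly(auditRoot, sourceDir string) string {
	return path.Join(auditRoot, path.Base(sourceDir))
}

// auditDirFullPath is a hypothetical helper that keeps the whole
// nested source directory under the audit root.
func auditDirFullPath(auditRoot, sourceDir string) string {
	return path.Join(auditRoot, sourceDir)
}

func main() {
	src := "/client_directory/with/subpath"
	fmt.Println(auditDirLastOnly("audit", src)) // audit/subpath
	fmt.Println(auditDirFullPath("audit", src)) // audit/client_directory/with/subpath
}
```

With the last-element form, two different nested directories that end in the same name would map to the same audit folder, which is one argument for the full-path form.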