Open ghost opened 3 years ago
Your connection is timing out: the Glue job cannot establish a connection to your SFTP server. Check that your security groups and network ACLs allow the inbound/outbound connections. Also check that the SFTP server itself accepts incoming connections.
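A quick way to test the reachability described above is a plain TCP connection attempt, run from the same environment as the job. This is only a minimal sketch using the Python standard library; the helper name `can_connect` is made up for illustration and is not part of the repository's script.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS resolution failures.
        return False
```

For example, `can_connect("sftp.example.com", 22)` (a placeholder host, replace it with your own) returning False from inside the job would suggest the problem is network reachability rather than paramiko or credentials.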
I am doing this as part of my college project, and I am new to AWS Glue and networking. Could you help me with this, or point me to an existing tutorial? I am asking because I haven't found any proper resource for SFTP to S3 using AWS Glue apart from your repository and another one quite similar to it.
While creating the Glue job I did not select any VPC, so how can I configure the security groups, and how can I select a VPC while creating a Glue job? Please don't mind my asking so many questions. If possible, please contact me by email: nani.veeru.9999@gmail.com
You should check out my articles on AWS Glue and data ingestion if you haven't already. For AWS Glue: https://towardsdatascience.com/extract-transform-load-etl-aws-glue-edd383218cfd For data ingestion: https://towardsdatascience.com/datalake-file-ingestion-from-ftp-to-aws-s3-253022ae54d4
I have gone through them, but I haven't found a fix for my issue. I will describe my problem here; see if you can get anything from it. Since this is for testing, I downloaded the Rebex Buru SFTP server, installed it on my system, created a user, and added some files to it. Now I am trying to copy these files to my S3 bucket using Glue. Initially I got some paramiko import errors, but I was able to resolve them. When I gave this Glue job the parameters of the Rebex test SFTP server (https://test.rebex.net/), I was able to connect and copy the files from that server. But when I give it the details of the server I installed on my laptop, it throws the above error. Can you help me here?
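One possible cause, and this is only an assumption since the address used for the laptop server is not given, is that the job was pointed at a private or loopback address that a Glue job cannot route to from AWS. The sketch below uses the standard `ipaddress` module to check that; the function name `is_publicly_routable` and the sample addresses are illustrative only.

```python
import ipaddress


def is_publicly_routable(address: str) -> bool:
    """Return True if the IP address is globally routable on the internet."""
    # is_global is False for private (e.g. 192.168.x.x, 10.x.x.x),
    # loopback (127.x.x.x), and other reserved ranges.
    return ipaddress.ip_address(address).is_global


# Illustrative addresses, not taken from the thread.
for addr in ("192.168.1.10", "127.0.0.1", "8.8.8.8"):
    print(addr, is_publicly_routable(addr))
```

If the laptop's address is not publicly routable, the job would need some network path to it (or the server would need to be hosted somewhere reachable) before the SFTP connection could succeed.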
What FTP server are you using on your laptop? How about trying FileZilla?
I will try that and update you.
Traceback (most recent call last):
  File "/tmp/runscript.py", line 211, in <module>
    runpy.run_path(temp_file_path, run_name='main')
  File "/usr/local/lib/python3.6/runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "/usr/local/lib/python3.6/runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "/usr/local/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/tmp/glue-python-scripts-mjxcjss5/glue-sftp-ingestion.py", line 27, in <module>
  File "/glue/lib/installation/paramiko/client.py", line 349, in connect
    retry_on_signal(lambda: sock.connect(addr))
  File "/glue/lib/installation/paramiko/util.py", line 283, in retry_on_signal
    return function()
  File "/glue/lib/installation/paramiko/client.py", line 349, in <lambda>
    retry_on_signal(lambda: sock.connect(addr))
TimeoutError: [Errno 110] Connection timed out
I am running the job in an AWS Glue Python shell. Can anyone help me resolve the issue?