datacarpentry / shell-genomics

Introduction to the Command Line for Genomics
https://datacarpentry.org/shell-genomics

sftp transfer software #290

Open sstevens2 opened 4 years ago

sstevens2 commented 4 years ago

In the FastQC part of the data wrangling lesson we have folks download data (and learners usually want to know how to download things anyway). I'm wondering if there is an easy-to-use, cross-platform program we can have them install. I really like Cyberduck, for example. Thoughts?

ErinBecker commented 4 years ago

Thanks for putting in this issue @sstevens2! The Maintainers for this lesson will be meeting on Friday and will try to offer some suggestions here. I'd love to also get input from the @datacarpentry/wrangling-genomics-maintainers, as they may be more familiar with different technologies and how they would fit into the scope of the lesson.

aschuerch commented 4 years ago

Note that the intro to 'Moving and Downloading Data' is in the shell-genomics lesson: https://datacarpentry.org/shell-genomics/05-writing-scripts/index.html. I would highly welcome an easy-to-use, cross-platform program as suggested by @sstevens2, as teaching this part is currently challenging with the different solutions offered for Windows and Unix.

sstevens2 commented 4 years ago

@aschuerch I was a bit unsure where to put this issue. I thought of it while proofing the setup section recently, but there are file-download sections in both the shell lesson and the data-wrangling FastQC episode. I'm happy to move it to the shell lesson if you'd prefer it be there.

Does anyone have a good idea of what other SFTP GUI options there are? I'll do a little research to see if I can find other software but if you know of some please add to this issue.

I like Cyberduck because I've used it for a while and it seems to work on both Windows and Mac, though I've only used the Mac version myself. It is not a solution for Linux, though it could cover most of our learners. It is free, though the developers ask users to buy a license or donate, as that helps support development.

sstevens2 commented 4 years ago

Looks like FileZilla works on all operating systems and is free. I thought it used to be in the lesson somewhere, but maybe I'm mistaken. I might be biased because I've used Cyberduck for so long, but I didn't find FileZilla as easy to use as Cyberduck. It might still be a good solution though.

ErinBecker commented 4 years ago

@sstevens2 - I think we had FileZilla in the lesson for a while but removed it. I don't remember why. Maybe one of the @datacarpentry/shell-genomics-maintainers can share some history and what if any discussions are ongoing about file transfer software?

I do think this issue belongs more in the individual lesson repos, however. Would you mind moving it over?

aschuerch commented 4 years ago

I am also not sure why FileZilla was removed, and I sometimes still teach FileZilla instead. Maybe the @datacarpentry/cloud-genomics-maintainers know because that's where moving/copying was located first (see https://github.com/datacarpentry/shell-genomics/issues/207).

jrherr commented 4 years ago

I'm late to the party as usual... From my memory, when we sat down at the workshop at CSHL where we initiated these teaching materials there was an emphasis on providing data transfer methods that were OS independent. We focused on scp but also included FileZilla and CyberDuck. There was a discussion to just focus on scp as this would be independent of outside programs, so I'm not sure if that was why FileZilla was removed subsequently. In my courses I teach scp but make students aware of other options for file transfer.

jsgro commented 3 years ago

FileZilla: there have been a lot of reports about security risks (an easy Google search turns them up), and that is probably the reason it was removed from these lessons.
scp has the advantage of easily transferring folders and their contents. I find it harder to use in teaching, as it requires a password at every transfer (as far as I have seen it demonstrated).
I always teach sftp, as it is available by default on all systems, including within PowerShell on Windows 10 (as far as I can test on my system). Bottom line: I always teach at least one command-line method that is not a GUI and that I think is rather "standard", with the idea/hope that it is present by default on all systems likely to be used on computers for Next Gen or Genomics.
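
For readers comparing the two command-line options above, here is a minimal sketch of what each looks like. The login, host, and paths are placeholders invented for illustration, not values from the lesson. The script only prints the scp command so it can be checked before anything is transferred, and the sftp session is shown as comments because it is typed interactively.

```shell
#!/bin/sh
# Placeholders (assumptions): this login, host, and these paths are made up.
REMOTE="learner@remote.example.org"
REMOTE_DIR="results/fastqc"
LOCAL_DIR="./fastqc_results"

# scp: the -r option copies a folder and everything inside it in one command.
# This function only prints the command; nothing is transferred.
scp_folder_cmd() {
    printf 'scp -r %s:%s %s\n' "$REMOTE" "$REMOTE_DIR" "$LOCAL_DIR"
}
scp_folder_cmd

# sftp: an interactive session. After logging in, each line is typed
# at the sftp> prompt:
#   sftp learner@remote.example.org
#   sftp> cd results/fastqc          (move around on the remote side)
#   sftp> ls                         (look before transferring)
#   sftp> get sample_fastqc.html     (download one file)
#   sftp> bye                        (end the session)
```

Running the script prints the scp command, which can then be pasted into a terminal once the placeholders are replaced with real values.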

sstevens2 commented 3 years ago

I will say this wasn't a big issue when I last taught it because we had them all install git bash so the scp was more or less the same between windows and mac/linux. Did have a small issue with zsh users on mac but this has a PR to address it already.

bkmgit commented 1 year ago

It would be good to use scp, as it is cross-platform.

bkmgit commented 1 year ago

Related issue https://github.com/datacarpentry/shell-genomics/issues/236

bkmgit commented 1 year ago

scp is given as an option.

bkmgit commented 10 months ago

sftp is a little easier to use as one can navigate to the file before transferring it. However, scp is easier to automate in scripts. Tools such as Globus may also be worth mentioning if they are available as some of the genomics datasets are large.
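
To illustrate the point about automation, here is a rough sketch of scp inside a script. The login, host, destination folder, and file names are hypothetical placeholders. The script defaults to a dry run that prints each command instead of transferring anything; setting `DRY_RUN=0` is assumed to be how a learner would switch to real transfers.

```shell
#!/bin/sh
# Placeholders (assumptions): this login, host, folder, and file names are made up.
REMOTE="learner@remote.example.org"
DEST="./downloads"

# Dry run by default: commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Fetch one remote file into DEST, or print the command in dry-run mode.
fetch() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "scp $REMOTE:$1 $DEST/"
    else
        scp "$REMOTE:$1" "$DEST/"
    fi
}

# Loop over a list of files; no interactive navigation is needed.
for f in sample1_fastqc.html sample2_fastqc.html; do
    fetch "results/$f"
done
```

Because every transfer is a single non-interactive command, the list of files can come from a loop or another program, which is harder to do with an interactive sftp session.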