jpuritz / dDocent

a bash pipeline for RAD sequencing
ddocent.com
MIT License

Tutorial #16

Closed outpaddling closed 8 years ago

outpaddling commented 8 years ago

We've been going through the tutorial at https://github.com/jpuritz/dDocent/blob/master/tutorials/Reference%20Assembly%20Tutorial.md.

First, this is fabulous! Thanks for your efforts to create such a high-quality document and corresponding scripts...

We're running dDocent here on CentOS 6 and FreeBSD and we encountered a few portability issues with the tutorial on the FreeBSD side:

1) This is not an issue for someone who follows your instructions to the letter, but if someone tries to run your downloaded bash scripts as "./script.sh" instead of "bash script.sh", the #!/bin/bash shebang will fail, because FreeBSD installs bash under /usr/local/bin rather than /bin. Changing the shebang line to #!/usr/bin/env bash would make it 99% portable.

2) In generating rainbow.fasta, the "cat rbasm.out <(echo "E")" fails. I replaced this with the following:

echo "E" > endmarker
cat rbasm.out endmarker | sed 's/[0-9]:[0-9]://g' | mawk ' {

3) Some of the scripts contain "awk" commands (not mawk) that depend on GNU extensions. If you use "gawk" in place of "awk", it should still work on Linux systems and will work on others that don't use GNU awk by default. On all the Linux systems I've used, awk is a link to gawk, so this change should not impact Linux users.
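For anyone who wants to try points 1) and 2) together, here is a minimal sketch rather than the tutorial's actual code. It uses the portable shebang and appends the "E" end marker with a command group, so it needs neither process substitution nor an extra endmarker file. The sample file and its two lines are made up for illustration and only stand in for rbasm.out.

```shell
#!/usr/bin/env bash
# Sketch only: append an end marker to a stream without <(...).
# "sample" is a hypothetical stand-in for rbasm.out; its contents are invented.
sample=$(mktemp "${TMPDIR:-/tmp}/rbasm_demo.XXXXXX")
printf 'first 1:2:record\nsecond record\n' > "$sample"

# The command group { ...; } sends the file and the marker down one pipe.
result=$({ cat "$sample"; echo "E"; } | sed 's/[0-9]:[0-9]://g')

printf '%s\n' "$result"
rm -f "$sample"
```

The command group is a swapped-in technique, not what the tutorial uses; it behaves like the two-step endmarker version above but leaves no extra file behind.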

jpuritz commented 8 years ago

Thanks for your comments. I will certainly update the shebang to make things more portable.

2) will be an issue for folks running a shell other than BASH. I will put a note in there explaining this, since the subshell is handy if you are running BASH.

3) I have gone back and forth with this one. You're correct that awk needs to link to GNU awk, which it does on CentOS, RHEL, and Fedora-based systems. However, not all systems have gawk installed by default either, so it's a pick-your-poison situation. I'll make some notes about this in the tutorials.

outpaddling commented 8 years ago

2) This actually failed on FreeBSD even using bash. Something to do with how ptys are handled differently.

3) How about something like this:

# Use awk if it is GNU awk; otherwise fall back to gawk.
if awk --version 2>/dev/null | grep -q GNU; then
     awk=awk
else
     awk=gawk
fi

Later, back at the ranch...

 $awk args

I assume there's a reason you're not using mawk for these few instances.
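A slightly fuller sketch of the same idea, which also covers the case where neither awk nor gawk turns out to be GNU awk. The helper name find_gnu_awk and the variable awk_cmd are hypothetical, and matching on the "GNU Awk" version banner is an assumption about how gawk identifies itself.

```shell
#!/usr/bin/env bash
# Sketch only: pick a GNU awk if one is installed, otherwise report failure.
# find_gnu_awk is a hypothetical helper name, not part of dDocent.
find_gnu_awk() {
    local candidate
    for candidate in awk gawk; do
        # Non-GNU awks may reject --version, so stderr is discarded and
        # stdin is closed in case an implementation tries to read input.
        if command -v "$candidate" >/dev/null 2>&1 &&
           "$candidate" --version 2>/dev/null </dev/null | grep -q 'GNU Awk'; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

if awk_cmd=$(find_gnu_awk); then
    # gensub() is a GNU extension, so this line only runs under gawk.
    "$awk_cmd" 'BEGIN { print gensub(/a/, "b", "g", "aaa") }'
else
    echo "No GNU awk found; install gawk or adjust the scripts." >&2
fi
```

Trying plain awk first keeps Linux behavior unchanged, and the explicit failure branch addresses the concern that gawk is not installed everywhere.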

Thanks,

 Jason

jpuritz commented 8 years ago

Resolved!

outpaddling commented 8 years ago

Thanks! FYI, I discovered that the sub-shell syntax will work on FreeBSD if fdescfs is mounted, but this patch will be helpful to those who aren't aware of this or are running certain other POSIX platforms.
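One way to cover both cases at run time is to probe whether the <(...) syntax (bash calls it process substitution) actually works and fall back if it does not. This is only a sketch: append_marker is a hypothetical helper name, the sample file stands in for rbasm.out, and the script must still be run by bash because other shells reject the syntax when parsing.

```shell
#!/usr/bin/env bash
# Sketch only: use process substitution where it works, else a fallback.
# append_marker is a hypothetical helper name, not part of dDocent.
append_marker() {
    # Probe: this fails when bash cannot open the /dev/fd entry it creates,
    # for example on FreeBSD without fdescfs mounted.
    if [ "$(cat <(echo probe) 2>/dev/null)" = "probe" ]; then
        cat "$1" <(echo "E")
    else
        # Fallback: a command group yields the same stream without <(...).
        { cat "$1"; echo "E"; }
    fi
}

sample=$(mktemp "${TMPDIR:-/tmp}/marker_demo.XXXXXX")   # stand-in for rbasm.out
printf 'a\nb\n' > "$sample"
marked=$(append_marker "$sample")
printf '%s\n' "$marked"
rm -f "$sample"
```

Either branch produces the file contents followed by a final "E" line, so downstream commands do not need to know which one ran.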