Closed JamesRH closed 11 years ago
I fixed part of this problem (the incorrect column numbers). What did we decide: should we limit this to just one cluster/run pair?
FYI - you called this function weirdly which was what caused the exact error you obtained (there was also a separate problem with column numbers that created an invalid fasta file) - the cluster and run ID should be in separate columns on input. So I recommend doing this:
makeTabDelimitedRow.py "all_I_1.7_c_0.1_m_maxbit" 805 | db_makeClusterAlignment.py -m mafft_default --notrim
Things like this are why I created the makeTabDelimitedRow.py function.
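For readers who have not seen the helper, here is a minimal sketch of what a script like makeTabDelimitedRow.py might do, assuming it simply joins its command-line arguments into one tab-separated line. The function name `make_tab_delimited_row` is made up for illustration, and the real script in the repository may behave differently.

```python
import sys


def make_tab_delimited_row(fields):
    """Join the given fields into a single tab-separated line (no trailing newline)."""
    return "\t".join(str(field) for field in fields)


if __name__ == "__main__":
    # Each command-line argument becomes one column of the output row,
    # so the run ID and the cluster ID end up in separate columns.
    print(make_tab_delimited_row(sys.argv[1:]))
```

Piping that single row into the next command gives it the run ID and the cluster ID as two distinct columns rather than one space-separated string.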
Yes, I agreed that your original design was the "least surprise". One cluster pair is fine.
Perhaps it makes more sense as a command-line arg. I think it makes sense to pass multi-line input through a pipe and single variables as arguments; they can always be multiplexed in a wrapper script or one of my xargs monstrosities.
James H.
I cannot tell you how many times I needed to do that today.
Now I know the function exists.
I really need to set aside an hour and read the documentation.
Here are two other ugly hacks. Do you know a better shell or do you have a program to do this?
1) I do this sort of thing when I need to reverse the columns (cut -f 3,1 gives the same output as cut -f 1,3, but I want it to be in the other order).
cat all* | cut -f 1 > tmp1; cat all* | cut -f 8 > tmp2; paste tmp2 tmp1 > out; rm tmp2 tmp1
Is there a better way?
2) Adding up one (or more) columns:
cat in|cut -f 1| paste -sd+ |bc
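One possible alternative to both hacks is a small Python helper. This is only a sketch, assuming tab-delimited input with 1-based column numbers as in cut. The names `reorder_columns` and `sum_column` are hypothetical and are not part of the existing scripts.

```python
def reorder_columns(lines, columns, sep="\t"):
    """Yield each line reduced to the requested 1-based columns, in the order given."""
    for line in lines:
        fields = line.rstrip("\n").split(sep)
        yield sep.join(fields[c - 1] for c in columns)


def sum_column(lines, column, sep="\t"):
    """Return the sum of one 1-based numeric column, skipping blank lines."""
    return sum(
        float(line.rstrip("\n").split(sep)[column - 1])
        for line in lines
        if line.strip()
    )


# Small in-memory demonstration; real use would pass an open file or sys.stdin.
rows = ["1\tgeneA\tx\n", "2\tgeneB\ty\n"]
for row in reorder_columns(rows, [3, 1]):
    print(row)
print(sum_column(rows, 1))
```

Unlike cut, the column list here controls the output order, so asking for columns 8 and 1 puts column 8 first without temporary files.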
Have a good weekend, James H.
On 11/30/2012 05:25 PM, mattb112885 wrote:
FYI - you called this function weirdly which was what caused the exact error you obtained (there was also a separate problem with column numbers that created an invalid fasta file) - the cluster and run ID should be in separate columns on input. So I recommend doing this:
makeTabDelimitedRow.py "all_I_1.7_c_0.1_m_maxbit" 805 | db_makeClusterAlignment.py -m mafft_default --notrim
Things like this are why I created the makeTabDelimitedRow.py function.
— Reply to this email directly or view it on GitHub https://github.com/mattb112885/clusterDbAnalysis/issues/23#issuecomment-10908263.
Sorry, my bug report sucked. I was trying to simplify it and ended up submitting it that way. In my actual code I was piping in a string that did pass the two values separated by a tab.
Sorry, James H
OK - I limited it to one for now but I agree it would make more sense as a command line arg (or with both options). I'm going to close this and file a new bug to that effect.
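A rough sketch of how the "both options" idea could look, assuming the run ID comes first and the cluster ID second, as in the example command above. The flag names `--runid` and `--clusterid` and the function `parse_run_cluster` are hypothetical; they are not the existing interface of db_makeClusterAlignment.py.

```python
import argparse
import io


def parse_run_cluster(argv, stdin):
    """Return (run_id, cluster_id) from the arguments if given, else from one stdin line."""
    parser = argparse.ArgumentParser(description="Hypothetical interface sketch")
    parser.add_argument("--runid", help="run ID (hypothetical flag name)")
    parser.add_argument("--clusterid", help="cluster ID (hypothetical flag name)")
    args = parser.parse_args(argv)

    if args.runid is not None and args.clusterid is not None:
        return args.runid, args.clusterid
    if args.runid is not None or args.clusterid is not None:
        parser.error("--runid and --clusterid must be given together")

    # Fall back to a single tab-delimited line on stdin: run ID, then cluster ID.
    fields = stdin.readline().rstrip("\n").split("\t")
    if len(fields) < 2:
        parser.error("expected run ID and cluster ID in separate tab-delimited columns")
    return fields[0], fields[1]


# Both call styles yield the same pair.
print(parse_run_cluster(["--runid", "all_I_1.7_c_0.1_m_maxbit", "--clusterid", "805"], io.StringIO("")))
print(parse_run_cluster([], io.StringIO("all_I_1.7_c_0.1_m_maxbit\t805\n")))
```

Reading only the first line of stdin matches the decision to limit the command to one cluster/run pair, while a wrapper script can still loop over many pairs.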
Around line 71 db_getClusterGeneInformation.py calls:
These column options should either be removed (the default is, I think, right) or be set to -g 1 -a 5 -s 12
Bug reproduction:
In the past I made these changes to db_makeClusterAlignment.py. As we discussed, I was making it multiplex many cluster runs as well as fixing this bug, and the multiplexing is not necessary.