Closed by e-manduchi 10 years ago
These changes might do, but more testing is advised:
line 33: >=3 instead of >3; line 34: $a[3] instead of $a[4]; line 51: $l[1] instead of $l[2]
Actually, the above correction will work only if the high.expressors file contains both the unique% and the %non-unique columns. If instead one had previously run runall_get_high_expressors with, say, the -u option, the high.expressors file contains fewer columns and the code needs to be adjusted. This would be a problem even in the original version of the code (the one using the gene symbols).
Maybe pass optional parameters -u or -nu to this script too, so that it decides which column to use accordingly. If neither parameter is passed in, then my correction above can be used. If either of them is passed in, then:
line 33: >=2; line 34: $a[2]; line 51: $l[1] instead of $l[2]
E.g. add an option -u or -nu (they will have the same effect, but this keeps the same options as used in runall_get_high_expressors).
Then define: $annot_column = 3; if (defined $ARGV[3] && ($ARGV[3] eq '-u' || $ARGV[3] eq '-nu')) { $annot_column = 2; }
then
former line 33: >=$annot_column; former line 34: $a[$annot_column]; former line 51: $l[1] instead of $l[2]
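The column selection proposed above could be sketched roughly as follows. This is only an illustration, not the script's actual code: the helper names annot_column_for and annotation_of are hypothetical, the input is assumed to be tab-separated, and the length check is written as "more fields than $annot_column", which may not match the exact comparison on the original line 33.

```perl
use strict;
use warnings;

# Hypothetical helper: pick the index of the annotation column.
# Assumes the flag, if present, is the fourth command-line argument,
# as in the $ARGV[3] snippet above.
sub annot_column_for {
    my @args = @_;
    my $annot_column = 3;    # default: both percentage columns are present
    if (defined $args[3] && ($args[3] eq '-u' || $args[3] eq '-nu')) {
        $annot_column = 2;   # one percentage column fewer
    }
    return $annot_column;
}

# Hypothetical helper: return the annotation field of one line,
# or undef when the line has too few fields (tab-separated input assumed).
sub annotation_of {
    my ($line, $annot_column) = @_;
    chomp $line;
    my @a = split /\t/, $line;
    return unless @a > $annot_column;    # guard against short lines
    return $a[$annot_column];
}
```

In the script itself this would be called with @ARGV, e.g. my $annot_column = annot_column_for(@ARGV);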
Added -u and -nu options and modified the script so that it removes the exon if no annotation is available.
In some cases exons might have no gene symbol attached to them, or have 'n/a'. The piece of code between lines 44 and 66 will end up filtering out from the master exons list any exons with no gene symbol or with 'n/a'. Note that gene symbol is an optional field in annotate.pl. Maybe the flag should instead check the main identifier from annotate.pl (what is under the name column)?
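To illustrate the difference between the two checks, here is a minimal sketch. The record layout (name and symbol keys), the sample values and the helper is_annotated are all made up for the example and are not taken from the script or from annotate.pl; treating 'n/a' case-insensitively is also an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: true when a field holds a usable annotation value.
sub is_annotated {
    my ($value) = @_;
    return 0 unless defined $value;
    return 0 if $value =~ /^\s*$/;      # empty or whitespace only
    return 0 if lc($value) eq 'n/a';    # assumed placeholder for "no annotation"
    return 1;
}

# Made-up records: main identifier (name column) plus optional gene symbol.
my @exons = (
    { name => 'ID_1', symbol => 'GENE1' },
    { name => 'ID_2', symbol => 'n/a' },
    { name => 'n/a',  symbol => 'n/a' },
);

# Filtering on the gene symbol drops ID_2 although it has a main identifier.
my @kept_by_symbol = grep { is_annotated($_->{symbol}) } @exons;

# Filtering on the main identifier keeps ID_1 and ID_2.
my @kept_by_name = grep { is_annotated($_->{name}) } @exons;
```

With these records, the symbol-based filter keeps one exon and the name-based filter keeps two.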