itmat / Normalization

RNA-Seq normalization and quantification pipeline
https://github.com/itmat/normalization/wiki
GNU General Public License v3.0

problem with filter_high_expressors.pl #24

Closed: e-manduchi closed this issue 10 years ago

e-manduchi commented 10 years ago

In some cases exons have no gene symbol attached to them, or have 'n/a' as the symbol. The code between lines 44 and 66 ends up filtering all such exons out of the master exons list. Note that the gene symbol is an optional field in annotate.pl. Maybe the flag should instead check the main identifier from annotate.pl (the value under the name column)?
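To illustrate the concern, here is a minimal sketch, not the code from filter_high_expressors.pl. It assumes a made-up record layout of [exon, name, gene symbol], and the identifiers (tx_A, GENE1, and so on) and the helper keep_unflagged are hypothetical. It shows one way that keying the filter on the gene symbol can drop every exon sharing 'n/a', while keying on the name drops only the flagged one.

```perl
use strict;
use warnings;

# Made-up annotated exons: [exon, name, gene symbol].
# The gene symbol is optional, so it may be 'n/a'.
my @exons = (
    ['chr1:100-200', 'tx_A', 'GENE1'],
    ['chr2:300-400', 'tx_B', 'n/a'],
    ['chr3:500-600', 'tx_C', 'n/a'],
);

# Hypothetical helper: keep the records whose field at index $idx
# is not a key of the %$flagged set.
sub keep_unflagged {
    my ($records, $flagged, $idx) = @_;
    return grep { !exists $flagged->{ $_->[$idx] } } @$records;
}

# Suppose the only high expressor is the exon annotated as tx_B / 'n/a'.
my @by_symbol = keep_unflagged(\@exons, { 'n/a'  => 1 }, 2);  # key on gene symbol
my @by_name   = keep_unflagged(\@exons, { 'tx_B' => 1 }, 1);  # key on name

printf "by symbol: %d kept, by name: %d kept\n",
    scalar @by_symbol, scalar @by_name;
```

Keyed on the symbol, the unrelated tx_C exon is removed along with tx_B. Keyed on the name, only tx_B is removed.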

e-manduchi commented 10 years ago

These changes might do, but more testing is advised:

line 33: >=3 instead of >3
line 34: $a[3] instead of $a[4]
line 51: $l[1] instead of $l[2]

e-manduchi commented 10 years ago

Actually, the above correction will work only if the high.expressors file contains both the %unique and the %non-unique columns. If one had instead previously run runall_get_high_expressors with, say, the -u option, the high.expressors file contains fewer columns and the code needs to be adjusted. This would be a problem even in the original version of the code (using the gene symbols).

e-manduchi commented 10 years ago

Maybe pass optional parameters -u or -nu to this script too, so that it decides which column to use accordingly. If neither parameter is passed in, then my correction above can be used. If either of them is passed in, then:

line 33: >=2
line 34: $a[2]
line 51: $l[1] instead of $l[2]

e-manduchi commented 10 years ago

E.g. add an option -u or -nu (they will have the same effect, but this keeps the same options used in runall_get_high_expressors).

Then define:

    $annot_column = 3;
    if (defined $ARGV[3] && ($ARGV[3] eq '-u' || $ARGV[3] eq '-nu')) { $annot_column = 2; }

then

former line 33: >=$annot_column
former line 34: $a[$annot_column]
former line 51: $l[1] instead of $l[2]
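The option handling proposed above could be sketched roughly as follows. This is a standalone illustration, not the script itself. The helper names annot_column_for and annotation_of are hypothetical, the sample rows are made up, and the sketch assumes that the check on former line 33 compares the last index of the split line against the annotation column.

```perl
use strict;
use warnings;

# Hypothetical helper: choose the index of the annotation column.
# Per this thread: 3 when the high.expressors file has both percentage
# columns, 2 when it was produced with -u or -nu (one column fewer).
sub annot_column_for {
    my ($opt) = @_;
    return 2 if defined $opt && ($opt eq '-u' || $opt eq '-nu');
    return 3;
}

# Hypothetical helper: return the annotation field of one tab-delimited
# line, or undef when the line has too few columns.
sub annotation_of {
    my ($line, $col) = @_;
    chomp $line;
    my @a = split /\t/, $line, -1;   # -1 keeps trailing empty fields
    return undef unless $#a >= $col; # assumed form of the ">=" check
    return $a[$col];
}

# Made-up rows: one with both percentage columns, one with a single one.
my $both = "chr1:100-200\t12.5\t11.0\ttx_A\tGENE1\n";
my $one  = "chr1:100-200\t12.5\ttx_A\tGENE1\n";

print annotation_of($both, annot_column_for(undef)), "\n";
print annotation_of($one,  annot_column_for('-u')),  "\n";
```

Both calls pick out the same name field (tx_A in the made-up rows), which is the point of switching the column index on the option.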

eunjijunekim commented 10 years ago

Added the -u and -nu options and modified the script so that it removes an exon when no annotation is available for it.