NBISweden / AGAT

Another Gtf/Gff Analysis Toolkit
GNU General Public License v3.0

agat_sp_fix_longest_ORF.pl default codon table #420

Closed jvolkening closed 4 days ago

jvolkening commented 4 months ago

This isn't really a bug or a feature request, and is related to #219. It took me a while to figure out this behavior (default table 1 allowed start codons are AUG, UUG and CUG) and that there was a codon table "0" which forced AUG start. I was going to open a PR with some changes to the POD and other docs to more clearly describe this situation and the use of table 0. I have to imagine that if other people are using this tool, many of them would expect the behavior of table 0 by default; as mentioned in #219, while the alternative start codons are found, they are generally the exception rather than the rule.

Then I saw this inline comment in the code:

#codontable_id by default=0 strict M as start codon

and it made me wonder: the default table is actually 1, but maybe it should be 0 or was intended to be 0 at some point? I would speculate that this is what many people would expect. On the other hand, this would be a breaking change as it would significantly change the default behavior. I wanted to bring this up as an issue to see if the authors had any insight before I spent time on a PR to update the docs.

Thanks.
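
To make the difference between the two tables concrete, here is a minimal Python sketch of how the set of allowed start codons changes which ORF is reported as the longest. It is not AGAT's code (AGAT is written in Perl and relies on BioPerl). The start-codon sets are the ones described in this comment; the names `START_CODONS`, `STOP_CODONS` and `longest_orf`, and the toy sequence, are hypothetical and made up for illustration. Only the forward strand is scanned.

```python
# Start codons as described in this thread, written in the DNA alphabet.
# Table 1 permits ATG, TTG and CTG; table 0 permits only ATG.
START_CODONS = {
    0: {"ATG"},
    1: {"ATG", "TTG", "CTG"},
}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq, table_id):
    """Return (start, end) of the longest forward-strand ORF, or None.

    Coordinates are 0-based and end-exclusive; the stop codon is included.
    """
    seq = seq.upper()
    starts = START_CODONS[table_id]
    best = None
    for frame in range(3):
        orf_start = None  # first allowed start codon seen since the last stop
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if orf_start is None and codon in starts:
                orf_start = i
            elif codon in STOP_CODONS:
                if orf_start is not None:
                    if best is None or (i + 3 - orf_start) > (best[1] - best[0]):
                        best = (orf_start, i + 3)
                orf_start = None  # a stop codon closes the current ORF
    return best


# Toy sequence: CTG AAA ATG GCC TAA (an in-frame CTG upstream of the ATG)
seq = "CTGAAAATGGCCTAA"
print(longest_orf(seq, 1))  # (0, 15): the ORF starts at the upstream CTG
print(longest_orf(seq, 0))  # (6, 15): the ORF starts at the ATG
```

With table 1 the reported ORF begins at the upstream CTG; with table 0 it begins at the ATG.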

Juke34 commented 4 months ago

Hi, I believe I set every script that takes a codon table to use table 1 by default. Table 0 is specific to bioperl, and it was even broken (unusable) in many versions of bioperl; I have fixed it in recent versions. So table 0 is something really specific. I think the comment you saw was meant to say that the "translate" function defaults to table 0 when no codon table is provided (with a fixed bioperl version; otherwise table 1 is used anyway). In any case, I do not use the "translate" function with its default value. AGAT sets the codon table to 1 by default, so the translate function uses that value unless you change the codon table with the appropriate parameter.

jvolkening commented 4 months ago

Thanks...I see that there are good reasons to leave the default as is. Would you be open to some additions to the documentation on this if I open a PR? I still think the default behavior will catch some people by surprise. I know it took me a while to figure out (1) why it was moving some of the start codons in my GFF3 to upstream non-standard codons, and (2) how to change this behavior to what I needed. In the end the fix was easy, but finding it documented wasn't.

Is the code using only the codon triplet to choose which start codon to use, or does it take into account the strongest surrounding context (e.g. Kozak consensus)? Personally, I would only want to allow alternative start codons when the full context was taken into account.

Thanks again.

Juke34 commented 4 months ago

Yes, sure, improving the docs is always a good idea ^^. We could provide a link to the NCBI codon tables, explain what table 0 is, etc. Go ahead, I will review your PR. Thanks.

123jjhy commented 1 week ago

Hi, table 0 seems to be invalid. Could you tell me how to use it?

MSG: Your version of bioperl do not handle codon table 0 It uses codon table 1 instead.

Codon table 1 in use. You can change it using --table option.

Thanks.

Juke34 commented 6 days ago

You must update bioperl to the most recent version; this is the only way to be able to use table 0 (or maybe use a very old version?). To update it, first check the path where bioperl is installed. AGAT tells you at the beginning of the log:

 ------------------------------------------------------------------------------
|   Another GFF Analysis Toolkit (AGAT) - Version: v1.4.0                      |
|   https://github.com/NBISweden/AGAT                                          |
|   National Bioinformatics Infrastructure Sweden (NBIS) - www.nbis.se         |
 ------------------------------------------------------------------------------
=> Using agat_config.yaml config file found in your working directory.

                          ------ Start parsing ------                           
-------------------------- parse options and metadata --------------------------
=> Accessing the feature_levels YAML file
Using standard /usr/local/lib/perl5/site_perl/auto/share/dist/AGAT/feature_levels.yaml file

So in that case the lib is here: /usr/local/lib/perl5/site_perl/. At this location you can then copy and paste the file and the folder found here: https://github.com/bioperl/bioperl-live/tree/master/lib

123jjhy commented 6 days ago

Hi, I've replaced the lib in /usr/local/lib/perl5/site_perl/ with the one from https://github.com/bioperl/bioperl-live/tree/master/lib, but I still can't use the --table 0 option. There is still a WARNING: MSG: Your version of bioperl do not handle codon table 0 It uses codon table 1 instead.

Juke34 commented 6 days ago

What AGAT version do you use?

123jjhy commented 6 days ago

------------------ Original message ------------------ From: "NBISweden/AGAT"; Sent: Thursday, June 27, 2024, 00:57; Subject: Re: [NBISweden/AGAT] agat_sp_fix_longest_ORF.pl default codon table (Issue #420)

What AGAT version do you use?

Juke34 commented 6 days ago

I will investigate this

Juke34 commented 5 days ago

Right, there is still a bug in my code. I will push a fix. If you want to fix it yourself, replace the get_proper_codon_table function in lib/AGAT/Utilities.pm with:

sub get_proper_codon_table {
  my ($codon_table_id_original) = @_;

  # Ask bioperl for the requested table, then read back the id it actually uses
  my $codonTable = Bio::Tools::CodonTable->new( -id => $codon_table_id_original);
  my $codon_table_id_bioperl = $codonTable->id;

  if (! defined($codon_table_id_bioperl)){
    $codon_table_id_bioperl = 1 ; # default codon table
  }

  # Warn when table 0 was requested but bioperl selected a different table
  if ($codon_table_id_original == 0 and $codon_table_id_original != $codon_table_id_bioperl){
    $codonTable->warn("Your version of bioperl do not handle codon table 0\n".
    "see https://github.com/bioperl/bioperl-live/pull/315\n".
    "It uses codon table $codon_table_id_bioperl instead.");
  }

  print "Codon table ".$codon_table_id_bioperl." in use. You can change it using the appropriate parameter.\n";
  return $codon_table_id_bioperl;
}

123jjhy commented 5 days ago

It is working now. Thank you for your patience!