cansyl / ECPred

ArrayIndexOutOfBoundsException when processing certain FASTA files #11

Open frikinzi opened 4 months ago

frikinzi commented 4 months ago

With some FASTA files I get this index-out-of-range error:

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: Array index out of range: 655
        at java.base/java.util.Vector.get(Vector.java:750)
        at runEC.predictions(runEC.java:124)
        at ECPred.main(ECPred.java:108)

I think the error happened in this line of code:

for (int j = 0; j < idlist.size(); j++)
        final_file.write(String.valueOf(df.format(Double.parseDouble(combined.get(j)))) + "\n");
final_file.close();

I think idlist and combined somehow ended up with different sizes. Changing the loop condition from j < idlist.size() to j < combined.size() let it run without errors, but I'm not sure whether that change leads to unexpected behavior.
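For anyone hitting the same thing, here is a minimal sketch of a stricter alternative to the loop-bound change: compare the two sizes before writing and stop with a clear message if they differ. This is not ECPred's actual code. The class GuardedScoreWriter, the method writeScores, and the "0.00" number pattern are hypothetical names and values chosen for illustration; only idlist, combined, and df come from the snippet above.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.text.DecimalFormat;
import java.util.List;
import java.util.Vector;

// Hypothetical helper, not part of ECPred.
public class GuardedScoreWriter {

    /**
     * Writes one formatted score per line, but only after confirming that
     * every sequence ID has a matching score. Returns the number of lines written.
     */
    public static int writeScores(List<String> idlist, List<String> combined,
                                  DecimalFormat df, Writer out) throws IOException {
        if (idlist.size() != combined.size()) {
            // Fail loudly: looping to combined.size() avoids the exception,
            // but it would not reveal that some IDs have no score.
            throw new IllegalStateException("Size mismatch: " + idlist.size()
                    + " IDs but " + combined.size() + " scores");
        }
        for (int j = 0; j < combined.size(); j++) {
            out.write(df.format(Double.parseDouble(combined.get(j))) + "\n");
        }
        return combined.size();
    }

    public static void main(String[] args) throws IOException {
        // Made-up data: three IDs but only two scores.
        Vector<String> idlist = new Vector<>(List.of("seq1", "seq2", "seq3"));
        Vector<String> combined = new Vector<>(List.of("0.91", "0.12"));
        try {
            // "0.00" is an assumed pattern; the real df pattern is not shown in this thread.
            writeScores(idlist, combined, new DecimalFormat("0.00"), new StringWriter());
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With the made-up data in main, this reports the mismatch instead of writing a shorter file. Whether a shorter combined list means scores are merely missing at the end or are shifted relative to their IDs depends on how ECPred fills it, which this sketch does not address.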

I am using the ECPred version linked on the GitHub README page, on a Linux high-performance computing system.

Thank you for looking into the issue. This is the FASTA file where the error happened: GCA_000010725.1.faa_part_10.txt

alperendalkiran commented 4 months ago

Hi, sorry for the late reply. I ran your FASTA file on my computer without any errors. Here are the results: results_GCA_000010725.1.faa_part_10.txt

frikinzi commented 4 months ago

Could you share which versions of the dependencies you're using? I'm still getting errors for some of the FASTA files on my HPC. It seems to fail on certain sequences, which then causes the ArrayIndexOutOfBoundsException. One example is:

GCA_000215745.1_03115 hypothetical protein
MLMQIRTATRTLTAIRIATATRIRTVTPIRTATRIPTAIRILTATRIRTAIRILTVTLTRTVIRTPTVIPIRTATRIPTAIRIRTVTPIRIATAIRTLTAIRILTAIRILTAIRILTATRTLTATRTLTVIPIPTATRTLTATRTLTVIPIPTATRLGFGQRFLRLRQRLGFRQRFLRLRQRLGFRQRFRLRQRLGF

I'm running the weighted option.