Open nicolasfredesfranco opened 3 years ago
The 'exit' on line 164 of feature.py causes several problems. When there is more than one crop, feature.sh never saves the numpy file. The first time feature generation runs (the second call of feature.py in feature.sh), it creates the first .aln and then exits. After that, plmDCA produces a .mat from that single .aln, but the next call reads the crops again, creates the .aln for the second crop, and exits again, so the final .npy is never saved.

I believe the best fix is to replace the 'exit' with a 'continue', but this produces an .aln for each crop and also a .mat for each crop. With this change, the second call of feature.py in feature.sh only produces the .aln of each crop, and the third call produces the entire numpy file with the .mat of each crop in it. Is this correct? Or is plmDCA supposed to be applied only to the .aln that corresponds to the fasta of the original (full) sequence? If the latter is the case, the code needs to be changed to produce only one .mat, perhaps by removing the 'exit' in feature.py and also replacing the second call of feature.py in feature.sh with other code that only produces the .aln of the main sequence.

Any comment is useful. I'm not totally sure of the right way to fix this, but I understand the problem and maybe we can figure it out together.
aln, aln_id = read_aln(fas_file)
aln = aln[:, aln[0] != '-']
write_aln(aln, aln_id, aln_file)
continue
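To make the difference between 'exit' and 'continue' concrete, here is a minimal sketch of the per-crop loop. It is not the actual code of feature.py: `process_crops`, the crop names, the placeholder file contents, and the `stop_after_first` flag are all hypothetical, and the real script also involves the .mat files from plmDCA, which this sketch leaves out.

```python
import os
import tempfile


def process_crops(crop_names, workdir, stop_after_first):
    """Hypothetical stand-in for the per-crop loop in feature.py.

    A crop with no .aln yet gets a placeholder .aln written.
    A crop that already has one is counted as ready for feature assembly.
    stop_after_first=True mimics the original 'exit' (modelled with break),
    stop_after_first=False mimics the proposed 'continue'.
    Returns (crops written in this call, crops ready in this call).
    """
    written, ready = [], []
    for name in crop_names:
        aln_file = os.path.join(workdir, name + ".aln")
        if not os.path.exists(aln_file):
            with open(aln_file, "w") as handle:
                handle.write("PLACEHOLDER\n")  # the real code calls write_aln
            written.append(name)
            if stop_after_first:
                break  # like 'exit': the remaining crops are never visited
            continue  # proposed fix: move on to the next crop
        ready.append(name)
    return written, ready


if __name__ == "__main__":
    crops = ["crop_0", "crop_1", "crop_2"]
    for stop in (True, False):
        with tempfile.TemporaryDirectory() as workdir:
            first = process_crops(crops, workdir, stop)
            second = process_crops(crops, workdir, stop)
            print("exit" if stop else "continue", first, second)
```

In this sketch, the 'exit' variant writes only one new .aln per call, so with several crops the second call still stops before all crops are ready. The 'continue' variant writes every .aln in the first call, and the following call finds all crops ready.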
Have you solved it now?