Similar genomes produced different results

chao77777777 commented 10 months ago

Hi, thank you for this excellent software! I have a question. I used two similar genomes, but an error occurred in the last part of the run on the second genome. I don't know what caused it. 1.txt 2.txt Many thanks, Wang

M-D75 commented 10 months ago

Hi,

Sorry for the late response. Do you have the following log files: log/RENAME.out and log/RENAME.err ?

The error occurred during the RENAME PSEUDO TE step. This step is only necessary if your TE database (DB) contains certain characters that could disrupt the pipeline's operation, such as ">", ":", "/", "\", "|". If these characters are not useful to you, you can remove them from your DB. This way, the RENAME PSEUDO TE step will not be carried out, and the pipeline will stop at the previous step, which builds the final files.
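
To see whether a TE library would trigger this step, the header names can be searched for the characters listed above. The following is only a minimal sketch: `list_problem_headers` is a hypothetical helper, not part of TrEMOLO, and it assumes the library is a FASTA file whose header lines start with ">". The character list is the one quoted in this comment and may not be exhaustive.

```shell
# Hypothetical helper (not part of TrEMOLO): print the FASTA header names
# that contain any of the characters > : / \ | after the leading ">".
list_problem_headers() {
    # Keep header lines only, drop the leading ">", then keep the names
    # that contain at least one listed character. "|| true" keeps the exit
    # status at 0 when no header matches.
    grep '^>' "$1" | sed 's/^>//' | grep -E '[>:/\\|]' || true
}
```

Calling `list_problem_headers TE_library.fa` (the file name is a placeholder) prints one name per line. A name such as TE_00000566#DNA/DTA would be reported because it contains "/".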

Thank you for the assistance you provide. Every reported issue helps us improve the tool.

Best regards, M-D

chao77777777 commented 10 months ago

Hi, thanks for your reply. I do not have the log files you mentioned. The TEs in my TElib are named as follows, for example:

TE_00000566#DNA/DTA CGTCGTCTCCATCGTCTCCGTCTGCAACTGCTGGTGAGAGCTTTCGACTTCGTTTTCGCTTGTTAATTTTGAAGCTCAATCTGATTCCCAACAGAACCCATCTCTGAAGCTGAAAGGAAAGCTTGCTTTCCTGTTTTTAATTTCGAAGCTAAAAGGCTTTTGAAGCTGAAAGGAAAGCTTGCTTTCCTGTTTTGCTTCCGAAGCTAAAAGGAAAGCTTGCTTTCCTGTTTTTAATTTCGAAGCAAAAGCAGCTGAAATGACT

If I keep these names, how does this error affect the results? Best wishes, wang

M-D75 commented 10 months ago

Hi,

Thank you for informing me of the error you encountered when using our tool. After a thorough diagnosis, I have identified the source of the problem, which is linked to the WORK_DIRECTORY path. During the RENAME PSEUDO TE step, the pipeline naively assumed that the WORK_DIRECTORY would always be located in a sub-directory of the current directory. But on your second run this was not the case (WORK_DIRECTORY: ../work_directory_name).

An update has been made to correct the problem.

Previous Error:

## CHANGE DIRECTORY
old_path=`pwd`
basename=`basename {params.work_directory}`
cd {params.work_directory}/REPORT/mini_report/web/js/

##ERROR ${{old_path}}/${{basename}}
awk  'NR>1 {{ split($4, sp, "|"); print sp[1] }}' ${{old_path}}/${{basename}}/TE_INFOS.bed | sort -u > ${{old_path}}/${{basename}}/tmp_LIST_TE_TO_CHANGE.txt

Correction made:


## CHANGE DIRECTORY
old_path=`pwd`
basename=`basename {params.work_directory}`

awk  'NR>1 {{ split($4, sp, "|"); print sp[1] }}' {params.work_directory}/TE_INFOS.bed | sort -u > {params.work_directory}/tmp_LIST_TE_TO_CHANGE.txt
cd {params.work_directory}/REPORT/mini_report/web/js/
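
The difference between the two versions can be reproduced outside the pipeline. The sketch below is an illustration only: `old_style_path` and `real_path` are hypothetical helpers, not code from TrEMOLO. The first rebuilds the path the way the previous rule did (current directory plus the last component of the work directory), and the second resolves where the work directory actually is.

```shell
# Hypothetical helper: rebuild the work directory path as the old rule did,
# i.e. current directory + basename of the given work directory.
old_style_path() {
    printf '%s/%s\n' "$(pwd)" "$(basename "$1")"
}

# Hypothetical helper: resolve the work directory to its actual absolute
# location. The subshell keeps the caller's current directory unchanged.
real_path() {
    (cd "$1" && pwd)
}
```

The two helpers print the same path when the work directory is a direct sub-directory of the current directory. With a value such as ../work_directory_name they differ, because the old reconstruction points to a directory inside the current directory that does not contain TE_INFOS.bed.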

We encourage you to retrieve the update to avoid these issues. Additionally, during our tests, we noticed that some TEs were not renamed correctly on certain graphs. We apologize for this inconvenience and are working to resolve it.

Best regards, M-D

chao77777777 commented 10 months ago

Hi, I admire your thoroughness. I now understand the cause of the error. But since I'm not proficient in programming, I haven't found where to apply the fix. Can you tell me more about it? Best wishes, wang

M-D75 commented 10 months ago

Hi,

I'm sorry, I should have been clearer. I've made the necessary corrections. The update is now available on our Git repository.

To retrieve this update, please follow one of the methods below:

Open a command terminal, navigate to the TrEMOLO directory using:

cd /path/to/TrEMOLO/

Then, update using:

git pull

If you prefer a fresh start, you can clone the repository again with:

git clone https://github.com/DrosophilaGenomeEvolution/TrEMOLO.git

I apologize for any inconvenience caused.

Best regards, M-D

chao77777777 commented 10 months ago

Hi, thank you for your patient reply! Best wishes, wang

Similar genomes produced different results #12