vlothec / TRASH

RepeatIdentifier
MIT License

Coordination discrepancy in output CSV #20

Open justdx opened 7 months ago

justdx commented 7 months ago

Firstly, I would like to express my gratitude for the development of TRASH. It has proven to be a valuable tool for annotating centromeres. However, upon reviewing the results of my run, I noticed an issue in the output CSV file: the coordinates reported for each repeat do not align with the sequence and are shifted one base to the right.
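A one-base shift like this can be checked by slicing the assembly at the reported coordinates and comparing the result with the repeat sequence. The sketch below is only an illustration: the function name `detect_offset` is hypothetical, the coordinates are assumed to be 1-based and inclusive, and nothing here reflects how TRASH itself computes or writes positions.

```python
def detect_offset(chrom_seq, start, end, repeat_seq, max_shift=2):
    """Find the shift that makes the reported interval match the repeat.

    Assumption: start and end are 1-based, inclusive coordinates.
    Returns 0 if the coordinates already match, -1 if they are reported
    one base too far to the right, +1 if one base too far to the left,
    and None if no shift within max_shift matches.
    """
    # Try the smallest shifts first: 0, -1, 1, -2, 2, ...
    for shift in sorted(range(-max_shift, max_shift + 1), key=abs):
        lo = start - 1 + shift  # convert 1-based start to a 0-based index
        hi = end + shift        # Python slices exclude the end index
        if lo < 0 or hi > len(chrom_seq):
            continue
        if chrom_seq[lo:hi] == repeat_seq:
            return shift
    return None


# Toy data, not taken from any real assembly.
chrom = "GGGGACGTTCAAGGGG"
repeat = "ACGTTCAA"  # occupies 1-based positions 5..12
print(detect_offset(chrom, 6, 13, repeat))
```

On the toy data the reported interval 6..13 is one base to the right of the true position, so the function returns -1. Running the same check over a few rows of the real output would show whether the shift is consistent across all repeats.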

Additionally, when annotating HOR, I encountered an unexpected problem. During the first run, despite specifying the parameter "--horclass" and providing the corresponding CSV file, an error consistently occurred, indicating that the alignment file could not be found. Consequently, I had to execute the tool a second time, this time using the parameter "--horonly". I am uncertain about the cause of this issue and would appreciate any insights you can provide.

junhaiqi commented 7 months ago

Hello, I don't understand how to identify HORs with TRASH. Can you give a command-line example? Thank you very much!

justdx commented 7 months ago

Here are the commands that I used to run TRASH. I suggest setting the number of threads (--par) to the number of chromosomes:

/TRASH/TRASH_run.sh assembly.fa --seqt CEN178.csv --horclass CEN178 --par 5 --o ./
/TRASH/TRASH_run.sh assembly.fa --seqt CEN178.csv --horclass CEN178 --par 5 --horonly --o ./

and below is the content of "CEN178.csv":

name,length,seq
CEN178,178,AGTATAAGAACTTAAACCGCAACCGATCTTAAAAGCCTAAGTAGTGTTTCCTTGTTAGAAGACACAAAGCCAAAGACTCATATGGACTTTGGCTACACCATGAAAGCTTTGAGAAGCAAGAAGAAGGTTGGTTAGTGTTTTGGAGTCGAATATGACTTGATGTCATGTGTATGATTG
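Since the template file declares a length next to each sequence, it may be worth confirming that the two agree before a long run. The following is a minimal sketch that assumes only the three columns shown above (name, length, seq); the helper name `check_templates` is hypothetical, and whether TRASH itself validates or even uses the length column is not something this thread establishes.

```python
import csv
import io


def check_templates(csv_text):
    """Return (name, declared_length, actual_length) for every row whose
    declared length differs from the length of its sequence."""
    mismatches = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        declared = int(row["length"])
        actual = len(row["seq"].strip())
        if declared != actual:
            mismatches.append((row["name"], declared, actual))
    return mismatches


# Toy rows for illustration only; these are not real repeat templates.
toy_csv = "name,length,seq\nTOY8,8,ACGTTCAA\nBAD,5,ACGT\n"
print(check_templates(toy_csv))
```

To check a real file, read it with `open("CEN178.csv").read()` and pass the text to the function; an empty list means every declared length matches its sequence.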

junhaiqi commented 6 months ago

I see, thank you very much!