ablab / stringdecomposer

Tool for decomposition of centromeric assemblies and long reads into monomers

SD--tsv #9

Open duhuipeng opened 4 years ago

duhuipeng commented 4 years ago

Dear author, I ran your code and it generated three .tsv files, as follows: [image] Which of these files should I mainly look at? Right now I am looking at final_decomposition.tsv, specifically this column of the file. I want to predict SVs, and I would like to know whether I can do that with this output. Looking forward to your reply.

duhuipeng commented 4 years ago

Dear author, could you explain this sentence? I cannot understand why (i+1,j) represents an insertion, (j+1,i) represents a deletion, and so on. [image]

seryrzu commented 4 years ago

Hi,

Thank you for your interest in StringDecomposer! The final output of the tool is in final_decomposition.tsv.

Regarding your second question: this is just how the graph is defined. It is analogous to the alignment matrix of two sequences (see, for example, https://en.wikipedia.org/wiki/Needleman%E2%80%93Wunsch_algorithm).
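To make the analogy concrete, here is a minimal sketch of a standard alignment matrix with unit costs. It is not StringDecomposer's actual graph or scoring; the function name `align_moves` and the move labels are made up for illustration. Each cell (i, j) is a vertex: a step that advances only one index consumes a character from one sequence and leaves a gap in the other (an insertion or a deletion, depending on which sequence is treated as the reference), while a diagonal step is a match or mismatch.

```python
def align_moves(a: str, b: str):
    """Fill a unit-cost alignment matrix and trace back one optimal path.

    Illustrative only: the costs and labels are assumptions, not
    StringDecomposer's scoring scheme.
    """
    n, m = len(a), len(b)
    # dp[i][j] = minimal cost of aligning a[:i] with b[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i
    for j in range(1, m + 1):
        dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = dp[i - 1][j - 1] + (a[i - 1] != b[j - 1])  # (i-1,j-1) -> (i,j)
            step_a = dp[i - 1][j] + 1                          # (i-1,j)   -> (i,j)
            step_b = dp[i][j - 1] + 1                          # (i,j-1)   -> (i,j)
            dp[i][j] = min(diag, step_a, step_b)

    # Walk back from the bottom-right corner to recover the moves.
    moves = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + (a[i - 1] != b[j - 1]):
            moves.append("match" if a[i - 1] == b[j - 1] else "mismatch")
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            moves.append("gap_in_b")  # only the index into a advances
            i -= 1
        else:
            moves.append("gap_in_a")  # only the index into b advances
            j -= 1
    moves.reverse()
    return dp[n][m], moves


cost, moves = align_moves("ACGT", "AGT")
print(cost, moves)
```

Whether `gap_in_b` is called an insertion or a deletion is purely a naming convention, which is why the graph definition simply fixes one.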

Thanks, Andrey

duhuipeng commented 4 years ago

Dear author, [image] [image] in final_decomposition.tsv, which column should I mainly look at to see structural variation? Is it the third column, the one with letters? Looking forward to your reply. Best

TanyaDvorkina commented 4 years ago

Hi!

Thank you again for your interest in StringDecomposer. The file final_decomposition.tsv has the following columns (from left to right):

  1. Sequence name (usually read or assembly)
  2. Best-aligned monomer name (it has ' at the end if the alignment is to the reverse complement)
  3. Alignment start position on sequence
  4. Alignment end position on sequence
  5. Alignment identity score
  6. Second best aligned monomer name
  7. Second best aligned monomer identity score
  8. Best-aligned monomer name when homopolymers are collapsed (like GGGG -> G) in both sequences
  9. Alignment score for the best monomer with collapsed homopolymers
  10. Second-best aligned monomer name with collapsed homopolymers
  11. Alignment score for the second-best monomer with collapsed homopolymers
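As an illustration of the column layout above, here is a minimal parsing sketch in Python. It assumes the file is tab-separated with exactly these 11 columns in this order and integer positions; the `Hit` type, its field names, and the `parse_decomposition` function are hypothetical and not part of StringDecomposer. Rows that do not fit this layout (for example a header line, if one is present) are skipped.

```python
import csv
from typing import Iterable, Iterator, NamedTuple


class Hit(NamedTuple):
    """One row of final_decomposition.tsv (field names are illustrative)."""
    seq_name: str
    monomer: str            # best monomer name without the trailing '
    reverse: bool           # True if the name ended with ' (reverse complement)
    start: int
    end: int
    identity: float
    second_monomer: str
    second_identity: float
    hpc_monomer: str        # "hpc" = homopolymers collapsed
    hpc_identity: float
    hpc_second_monomer: str
    hpc_second_identity: float


def parse_decomposition(lines: Iterable[str]) -> Iterator[Hit]:
    """Yield a Hit for every well-formed row; skip anything else."""
    # QUOTE_NONE keeps the ' and any " in monomer names as literal characters.
    for row in csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) < 11:
            continue
        try:
            yield Hit(
                seq_name=row[0],
                monomer=row[1].rstrip("'"),
                reverse=row[1].endswith("'"),
                start=int(row[2]),
                end=int(row[3]),
                identity=float(row[4]),
                second_monomer=row[5],
                second_identity=float(row[6]),
                hpc_monomer=row[7],
                hpc_identity=float(row[8]),
                hpc_second_monomer=row[9],
                hpc_second_identity=float(row[10]),
            )
        except ValueError:
            # Non-numeric position or score: header or malformed row.
            continue


# Made-up example row, only to show the shape of the data.
example = ["read1\tA_1'\t10\t180\t97.5\tB_2\t80.1\tA_1'\t98.0\tB_2\t81.0\n"]
for hit in parse_decomposition(example):
    print(hit.seq_name, hit.monomer, hit.reverse, hit.start, hit.end, hit.identity)
```

With a real file, the same function can be applied to an open file handle, e.g. `parse_decomposition(open("final_decomposition.tsv"))`.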

Sorry for the late response!

Thank you, Tanya