Closed by Yutang-ETH 2 years ago
The number before H can be very large. That is fine in a SAM file, but it becomes a problem when you convert those SAM files to BAM or CRAM, because the binary formats limit how large those numbers can be.
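One limit of this kind is that BAM packs each CIGAR operation into a 32-bit integer as (length << 4 | op), leaving 28 bits for the length. The sketch below assumes that this is the limit meant here; it flags CIGAR operations that would not fit. The function name oversized_ops and the example CIGAR string are made up for illustration.

```python
import re

# BAM stores each CIGAR operation as (length << 4 | op) in a 32-bit integer,
# so only 28 bits are available for the operation length.
BAM_MAX_CIGAR_OP_LEN = (1 << 28) - 1  # 268,435,455

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def oversized_ops(cigar):
    """Return the (length, op) pairs in a CIGAR string that exceed the BAM limit."""
    return [
        (int(length), op)
        for length, op in CIGAR_OP.findall(cigar)
        if int(length) > BAM_MAX_CIGAR_OP_LEN
    ]


# Hypothetical CIGAR from a chromosome-scale alignment with a long hard clip.
print(oversized_ops("300000000H1500M2I800M"))
```

Running a check like this over column 6 of the SAM file before conversion would show whether the hard clips are the only operations above the limit.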
Thank you very much, Baoxing, for your explanation. Yesterday I tried to convert SAM to BAM without removing the H, and it failed, just as you said.
By the way, I think converting MAF to SAM and then to PAF is not a good idea, because the query coordinates (the global coordinates of the query within the chain) are lost after converting SAM to PAF. I think this is because SAM doesn't store the global coordinates of the query. However, I found a workaround: swap query and reference in the MAF, then convert the swapped MAF to SAM and then to PAF. The reference coordinates in the swapped PAF are the query coordinates of the non-swapped PAF.
Best wishes, Yutang
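For context, a SAM record can only express the position of the alignment on the full query through its clipping operations: the leading H (or S) length is the offset of the first aligned base. The sketch below shows that arithmetic and what happens once the H operations are stripped. It is an illustration only, the name query_interval is made up, and I have not checked how maf-convert or paftools.js handle these cases internally.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
ALIGNED_QUERY_OPS = set("MI=X")  # operations that cover aligned query bases


def query_interval(cigar):
    """Return (start, end) of the aligned part of the query, 0-based, end-exclusive.

    Coordinates are in the orientation the query is aligned in; for a
    reverse-strand record they count from the end of the original sequence.
    """
    ops = [(int(length), op) for length, op in CIGAR_OP.findall(cigar)]
    start = 0
    for length, op in ops:
        if op not in "HS":
            break
        start += length  # leading clips are query bases before the alignment
    aligned = sum(length for length, op in ops if op in ALIGNED_QUERY_OPS)
    return start, start + aligned


print(query_interval("1000H50M2I48M500H"))  # offset kept by the hard clip
print(query_interval("50M2I48M"))           # same record after removing H
```

With the hard clips in place the interval starts at 1000; after they are removed it starts at 0, so the position on the full query can no longer be recovered from the record.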
Hi Yutang,
Did you test paftools.js for converting SAM to PAF? Does it give the correct result?
Shenglong
Hi Shenglong,
Thank you very much for asking. Yes, I used paftools.js to convert SAM to PAF. It prints some warnings, but the resulting PAF seems correct. I don't actually understand the warning message, but paftools.js finished anyway.
Best wishes, Yutang
I think I get a similar message; the warning seems to be about the large "H". What do you mean by the swap mentioned above? If I use paftools, do I still need to consider this?
I also guess the warning message is related to "H".
Let's say you align A to B, where A is the ref and B is the query. When you convert MAF to SAM, only the coordinates of the ref are retained in the SAM; the coordinates of B are lost. In my case I also need the coordinates of B, so I swapped ref and query in the MAF using the Python script provided by AnchorWave. Now B is the ref and A is the query in the swapped MAF. Then I converted this swapped MAF to SAM, so the coordinates in the SAM are B's. I hope this is clear.
If you only need the ref coordinates, you don't need to do what I did.
Best wishes, Yutang
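The swap described above can be sketched for a single pairwise MAF block as follows. This is not the script shipped with AnchorWave; the function names are made up for illustration. It assumes each block has exactly two "s" lines in the standard layout (name, start, size, strand, srcSize, text), and it flips both lines to the opposite strand when the old query is on the minus strand so that the new ref ends up on the plus strand.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def parse_s_line(line):
    """Parse one MAF 's' line into a dict."""
    _, name, start, size, strand, src_size, text = line.split()
    return {"name": name, "start": int(start), "size": int(size),
            "strand": strand, "src_size": int(src_size), "text": text}


def format_s_line(rec):
    return "s {name} {start} {size} {strand} {src_size} {text}".format(**rec)


def flip(rec):
    """Describe the same aligned interval on the opposite strand."""
    return {
        **rec,
        # MAF starts on '-' are counted from the end of the source sequence.
        "start": rec["src_size"] - (rec["start"] + rec["size"]),
        "strand": "-" if rec["strand"] == "+" else "+",
        # Reverse-complement the aligned text; gap characters are kept as-is.
        "text": rec["text"].translate(COMPLEMENT)[::-1],
    }


def swap_pair(ref_line, query_line):
    """Return the two 's' lines with the old query first (as the new ref)."""
    ref, query = parse_s_line(ref_line), parse_s_line(query_line)
    if query["strand"] == "-":
        ref, query = flip(ref), flip(query)
    return [format_s_line(query), format_s_line(ref)]


for line in swap_pair("s A 10 5 + 100 ACGTA", "s B 20 4 - 50 AC-TA"):
    print(line)
```

After the swap, B comes first and is on the plus strand, so a MAF-to-SAM conversion would report B's coordinates in the position column.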
Clear explanation! Thanks for your kind answer.
No problem, have fun.
Best wishes, Yutang
Hi Baoxing,
First of all, congratulations on your new position at Peking University.
My question is regarding this line: python2 maf-convert sam anchorwave.maf | sed 's/[0-9]\+H//g' > anchorwave.sam
Why do you strip every H from the SAM? Does this mean you remove all hard clips (H) from the CIGAR strings? Could you please shed some light on this? Thank you very much.
Best wishes, Yutang
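Note that the sed substitution is applied to the whole line, so it would also alter a sequence name that happens to contain digits followed by H. A minimal sketch of a stricter variant, which removes hard clips from the CIGAR column only, is shown below. The function name strip_hard_clips and the example record are made up for illustration.

```python
import re

HARD_CLIP = re.compile(r"\d+H")


def strip_hard_clips(sam_line):
    """Remove hard-clip operations from the CIGAR field (column 6) only."""
    if sam_line.startswith("@"):
        return sam_line  # header lines are passed through unchanged
    newline = "\n" if sam_line.endswith("\n") else ""
    fields = sam_line.rstrip("\n").split("\t")
    if len(fields) > 5:
        fields[5] = HARD_CLIP.sub("", fields[5])
    return "\t".join(fields) + newline


# Hypothetical record whose name contains "12H"; only the CIGAR is changed.
record = "scaffold_12H\t0\tchr1\t100\t255\t300000000H50M2I48M\t*\t0\t0\t*\t*\n"
print(strip_hard_clips(record), end="")
```

It could be used as a filter in place of sed by reading lines from standard input and writing the result of strip_hard_clips for each one.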