Closed MichelMoser closed 9 years ago
Column 1 is the A read, column 2 is the B read, and column 3 is the orientation (c for complement, n for normal). For example, read 1 has a local alignment with the complement of read 128, read 1 has a local alignment with read 291, read 1 has a local alignment with the complement of read 386, and so on. This is also described in my blog post:
Hope that helps, Gene
On 10/12/15, 7:52 AM, MichelMoser wrote:
Dear Gene,
I would like to extract the reads (by name) that mapped to a reference genome. Is it possible to have LAshow print the read names instead of their index numbers (the second column of the .las output)? Or could I just parse the second column of the LAshow output and pipe it to DBshow with the -n option? This works, but I am not sure I get the correct results.
Another thing that was bugging me is columns 1 and 3 of the .las output. Could you tell me what the integer in col_1 and the 'n' or 'c' in col_3 stand for?
paxi_mt.50smrtex: 420,488 records
col: [1]  [2 ][3]  [       4        ]   [      5       ]     [             6             ]
     1    128 c   [ 21,982.. 23,180] x [ 7,448.. 8,710] :   <    125 diffs  ( 13 trace pts)
     1    291 n   [ 22,018.. 23,180] x [ 3,095.. 4,380] :   <    185 diffs  ( 12 trace pts)
     1    386 c   [ 21,969.. 23,167] x [ 4,940.. 6,179] :   <    138 diffs  ( 13 trace pts)
     1    463 n   [ 43,537.. 45,191] x [   323.. 2,249] :   <    327 diffs  ( 17 trace pts)
     1    711 n   [ 21,980.. 23,169] x [ 4,799.. 6,069] :   <    173 diffs  ( 13 trace pts)
     1    775 c   [ 21,976.. 23,180] x [11,411..12,648] :   <    103 diffs  ( 13 trace pts)
     1    785 c   [ 21,968.. 23,163] x [ 1,012.. 2,214] :   <    107 diffs  ( 13 trace pts)
Thank you, Michel
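As a rough illustration of the column meanings Gene describes (A read, B read, orientation), the rows of an LAshow listing could be reduced to tab-separated records with a small awk filter. This is only a sketch: the function name `las_columns` is hypothetical, and it assumes every alignment row begins with the A index, the B index, and then `n` or `c`, as in the listing quoted in this thread. Stripping commas is a precaution in case large read indices are printed with thousands separators.

```shell
# Sketch: turn LAshow rows (on stdin) into "A<TAB>B<TAB>orientation" records.
# Assumption: alignment rows start with A-index, B-index, then 'n' or 'c'.
las_columns() {
  awk '
    # Only rows whose third field is an orientation flag are alignments;
    # this skips the "<name>: N records" header and any annotation lines.
    $3 == "n" || $3 == "c" {
      a = $1; b = $2
      gsub(/,/, "", a)   # drop thousands separators, if any were printed
      gsub(/,/, "", b)
      printf "%s\t%s\t%s\n", a, b, ($3 == "c" ? "complement" : "normal")
    }
  '
}
```

For the first row of the listing, this would emit `1`, `128`, and `complement`, i.e. read 1 aligns with the complement of read 128.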
Thank you. So to extract the names of the B reads from the .las file, I could indeed do something like:
LAshow A.dam B.db A_B.las | sed 's/\s\+/\t/' - | awk 'NR > 2 {print $2}' | DBshow -n B.db - > B_read_names_which_match_A.txt
I didn't understand the pipe, but it is correct that DBshow -n will give you the headers for each read index. So definitely the correct idea. -- Gene
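The idea confirmed above (collect the B read indices, then ask DBshow -n for their headers) could be sketched in two steps. The helper name `b_read_indices` and the intermediate file name are hypothetical. The sketch assumes that alignment rows carry `n` or `c` in the third field, and that DBshow accepts a file of read indices as its reads argument. Sorting and de-duplicating the indices is a cautious choice here, not something the thread requires.

```shell
# Sketch: list the distinct B-read indices found in LAshow output on stdin.
# Assumption: alignment rows have the B index in field 2 and 'n'/'c' in field 3.
b_read_indices() {
  awk '$3 == "n" || $3 == "c" {
         b = $2
         gsub(/,/, "", b)   # drop thousands separators, if any were printed
         print b
       }' | sort -n -u      # ascending order, each index once
}

# Hypothetical usage, with the file names from the thread:
#   LAshow A.dam B.db A_B.las | b_read_indices > B_indices.txt
#   DBshow -n B.db B_indices.txt > B_read_names_which_match_A.txt
```

Because awk already splits fields on runs of whitespace, the sed step from the original one-liner is not needed in this variant.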