Closed duartemolha closed 4 years ago
The reason I ask is that one of the lines I got was this: 22,38285985;0,166400969|22,38285985 22,38286465;9,129780021|9,129780042|9,129780063|9,129780084|9,129780105|9,129780126|22,38286444|22,38286465
What does this mean exactly?
Good point! I added a section to the wiki explaining the csv format.
Feel free to reopen the issue if there are any questions left!
Would it be possible to use the name of the sequence from the fasta file instead of its number?
It would be much more helpful if it said chr1 instead of 0, chr2 instead of 1, etc.
For the example I gave above, I am assuming that the 22 in this: 22,38285985;0,166400969|22,38285985
corresponds to the 23rd sequence in the indexed fasta file (chrX)
so chrX,38285985;chr1,166400969|chrX,38285985
correct?
Unfortunately that would bloat up the csv file even more. If you still have the fasta file lying around, you can just replace the numbers with a one-liner in awk (gensub requires GNU awk):
awk 'BEGIN{id = 0} FNR==NR{ if ($0 ~ /^>/) { gsub(/>/, "", $0); f[id++] = $0; } next; } { for (i in f) { pattern = "(^|;|[|])" i ","; $0 = gensub(pattern, "\\1" f[i] ",", "g"); } print $0 }' genome.fa genome.genmap.csv > new.csv
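For anyone who prefers Python, the same replacement can be sketched roughly as below. This is only an illustration under a few assumptions: sequence numbers are 0-based and follow the order of the headers in the fasta file, each location is written as `number,position`, and locations are separated by `;` or `|` as in the line quoted above. The function names `read_fasta_names` and `rename_ids` are made up for this example and are not part of GenMap.

```python
import re


def read_fasta_names(fasta_lines):
    """Collect the header lines (without '>') in file order.

    The list index is assumed to equal the sequence number in the csv.
    """
    return [line[1:].strip() for line in fasta_lines if line.startswith(">")]


def rename_ids(csv_line, names):
    """Replace each leading sequence number of a location with its name.

    A number is only replaced when it starts the line or follows ';' or '|',
    so the positions after the comma are left alone. All replacements happen
    in a single pass, so a sequence name that is itself numeric is not
    replaced a second time.
    """
    def repl(match):
        prefix, number = match.group(1), match.group(2)
        index = int(number)
        # Leave unknown numbers unchanged instead of failing.
        name = names[index] if index < len(names) else number
        return f"{prefix}{name},"

    return re.sub(r"(^|[;|])(\d+),", repl, csv_line)


if __name__ == "__main__":
    # Hypothetical genome with chr1..chr22 followed by chrX.
    headers = [f">chr{i}" for i in range(1, 23)] + [">chrX"]
    names = read_fasta_names(headers)
    print(rename_ids("22,38285985;0,166400969|22,38285985", names))
```

To process real files, one would read the headers from genome.fa and apply `rename_ids` to every line of genome.genmap.csv; the csv header row contains no `number,` fields, so it should pass through unchanged.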
Ok.. thanks... That is pretty much what I had done :)
I was trying to find this in the help documentation, but failed to do so.
Can you please explain the output of the CSV file when using the -d option?
Thanks
Duarte