sr320 / paper-pano-go

Draft manuscript describing Panopea gonad transcriptome

Perl command #17

Closed lafarga13 closed 7 years ago

lafarga13 commented 8 years ago

@sr320 Hi Steven. For the CpG analyses, can you let me know the options for running this? I can't run the perl command on my Windows machine, so I am stuck, and I want to run these analyses by myself... Can you help me, or let me know how I can get the sequence lengths without the perl command?

!perl -e '$col = 2;' -e 'while (<>) { s/\r?\n//; @F = split /\t/, $_; $len = length($F[$col]); print "$_\t$len\n" } warn "\nAdded column with length of column $col for $. lines.\n\n";' analyses/Geoduck-transcriptome-v2.tab > analyses/Geoduck-transcriptome-v2-len.tab

sr320 commented 8 years ago

Here is Geoduck-transcriptome-v2-len.tab

http://owl.fish.washington.edu/halfshell/bu-git-repos/paper-pano-go/jupyter-nbs/analyses/Geoduck-transcriptome-v2-len.tab

I will also see if I can come up with a non perl solution.
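For reference, a minimal Python sketch of what the Perl command quoted above appears to do: append the length of the tab-separated field at index 2 (counting from zero) to every line. The function names `with_length` and `add_length_column` are hypothetical, and treating a missing field as length 0 is an assumption of this sketch, not something the Perl version guarantees.

```python
def with_length(line, col=2):
    """Return `line` with the length of its tab-separated field `col` appended."""
    line = line.rstrip("\r\n")          # drop Unix or Windows line endings
    fields = line.split("\t")
    # Assumption: a line with too few fields gets a length of 0.
    value = fields[col] if col < len(fields) else ""
    return f"{line}\t{len(value)}"


def add_length_column(in_path, out_path, col=2):
    """Write a copy of in_path with one extra length column; return the line count."""
    count = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(with_length(line, col) + "\n")
            count += 1
    return count
```

In a notebook cell this could be called as `add_length_column("analyses/Geoduck-transcriptome-v2.tab", "analyses/Geoduck-transcriptome-v2-len.tab")`, which should work on Windows without Perl installed.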

sr320 commented 8 years ago

Ok here is a more elegant solution courtesy of https://github.com/stephenturner/oneliners

!cat Geoduck-transcriptome-v2.fasta | awk '$0 ~ ">" {print c; c=0;printf substr($0,2,100) "\t"; } $0 !~ ">" {c+=length($0);} END { print c; }' > NewTabfile

Will generate something like

comp7_c0_seq1 len=210 path=[5082:0-45 293:46-209]   210
comp30_c0_seq1 len=201 path=[6331:0-200]    201
comp35_c0_seq1 len=209 path=[12565:0-208]   209
comp36_c0_seq1 len=202 path=[13423:0-27 13870:28-114 13870:115-201] 202

If you wanted simpler sequence names, you could first do ...

!cut -d' ' -f1 ../data-results/Geoduck-transcriptome-v2.fasta > analyses/Geoduck-transcriptome-v2_IDonly.fasta

This keeps only the ID on each header line (comp7_c0_seq1, etc.).
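If awk and cut are not available (for example on Windows), the same two steps could be sketched in Python roughly as follows. This is only an illustration: `fasta_lengths` and `write_length_table` are hypothetical names, and unlike the awk one-liner it does not truncate headers to 100 characters or print a blank first line.

```python
def fasta_lengths(lines, id_only=True):
    """Yield (name, length) for each record in an iterable of FASTA lines."""
    name, length = None, 0
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            if name is not None:
                yield name, length       # finish the previous record
            header = line[1:]            # drop the leading ">"
            # id_only keeps the text before the first space, like cut -d' ' -f1
            name = header.split(" ", 1)[0] if id_only else header
            length = 0
        elif name is not None:
            length += len(line)          # sequence may span several lines
    if name is not None:
        yield name, length               # last record in the file


def write_length_table(fasta_path, out_path, id_only=True):
    """Write one 'name<TAB>length' line per FASTA record."""
    with open(fasta_path) as src, open(out_path, "w") as dst:
        for name, length in fasta_lengths(src, id_only):
            dst.write(f"{name}\t{length}\n")
```

For example, `write_length_table("Geoduck-transcriptome-v2.fasta", "NewTabfile")` would write one line per sequence, with the ID and its length separated by a tab.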

lafarga13 commented 8 years ago

thanks... hope I can follow it... a lot of things going on finishing the year... sorry about the delays ;-(

lafarga13 commented 8 years ago

Steven, please do not close this issue because I haven't been able to do anything yet... thanks

lafarga13 commented 8 years ago

@sr320 Steven, please do not close this issue because I haven't been able to do anything yet... thanks. I hope I can finally get to it this week...