Status: Closed (closed by xiaocong3333 4 years ago)
Hi, you should be able to extract the TE counts by looking for lines where the feature name (column 1) contains colons (e.g. L1HS:L1:LINE). In contrast, gene names should not contain colons (unless your model organism uses some unusual gene nomenclature). If you are still having trouble, please feel free to send me a copy of your counts file. Thanks
Got it, thank you! So if I need the TE abundance, do I have to copy the TEs out one by one? Is there another way to do that, such as a command-line option?
Hi,
If you are comfortable with the Unix/Linux command line, you can use the built-in grep command:
grep ":" gene_TE_counts.txt > TE_counts.txt
grep -v ":" gene_TE_counts.txt > gene_counts.txt
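The two grep commands match a colon anywhere on the line, and the first one drops a header line that has no colon. As a rough sketch of a variant that checks only column 1 and copies the first line into both outputs, something like the following might work. It assumes the file is tab-separated and has exactly one header line; the function name split_counts is made up for illustration and is not part of any tool.

```shell
# Hypothetical helper: split a combined count table into TE rows and gene rows.
# Assumptions (not confirmed by the tool's docs): tab-separated columns,
# exactly one header line, and TE names contain ':' in column 1.
split_counts() {
    infile="$1"     # combined table, e.g. gene_TE_counts.txt
    te_out="$2"     # destination for TE rows
    gene_out="$3"   # destination for gene rows

    # Keep the header (line 1) plus rows whose first column contains a colon.
    awk -F '\t' 'NR == 1 || $1 ~ /:/' "$infile" > "$te_out"

    # Keep the header plus rows whose first column does not contain a colon.
    awk -F '\t' 'NR == 1 || $1 !~ /:/' "$infile" > "$gene_out"
}
```

It could then be called as: split_counts gene_TE_counts.txt TE_counts.txt gene_counts.txt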
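If you are more comfortable with PowerShell (e.g. on Windows), you can use Select-String. Its match objects carry the file name and line number along with the text, so pass only the Line property to the output file:
Select-String -Path gene_TE_counts.txt -Pattern ':' | ForEach-Object { $_.Line } | Out-File -FilePath TE_counts.txt
Select-String -Path gene_TE_counts.txt -Pattern ':' -NotMatch | ForEach-Object { $_.Line } | Out-File -FilePath gene_counts.txt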
Hope this is helpful. Thanks
This is very helpful! Thank you!
The final output is a combined gene and TE count table, but sometimes I need only the TE counts or only the gene counts. How should I do that?