Open caity-s opened 7 years ago
Hi @caity-s, @inodb, I am trying to use this tutorial for my gut shotgun metagenomic data, and this is my first time. I saw the comment above about RNA-seq. Can I use tpm_table.py to calculate gene abundances for my data? If so, the calculated values are in TPM; can I later convert these values to relative abundance? Thanks
Hello,
This is a very helpful tool for processing htseq count data, and I am using it to get the TPM for some RNA-seq counts after running htseq. However, I am not sure how the average read length affects the output: when I change the value passed via -i to 1, 5, or 100000000, the files produced are always exactly the same.
I am just running variations of the following command:
python tpm_table.py -n 20100900_E1D -c 20100900_E1D.count -i <(echo -e "20100900_E1D\t1") -l 647_dereplicated.genelengths | sort > 20100900_E1D_tpm.tsv
No error messages are produced unless I set the average read length to 0.
Maybe I have done something wrong? Or maybe the read length does not affect low count numbers? The highest count I have for a gene_id is 603227 (with a corresponding TPM of 60025.2112), and it is the same in files where I specified an average read length of 1 or 10,000,000,000,000.
Thank you,
Caity
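One possible explanation, sketched here as an assumption rather than a statement about what tpm_table.py actually does: in the commonly used TPM formula, the per-gene rate is count * read_length / gene_length, and each rate is then divided by the sum of all rates. If the same average read length is applied to every gene in a sample, it appears in both the numerator and the denominator and cancels out. The function `tpm` and the gene names and numbers below are hypothetical and only illustrate that arithmetic.

```python
import math


def tpm(counts, gene_lengths, read_length):
    """Hypothetical TPM calculation with an explicit read-length term.

    Assumes the common formula: rate_g = count_g * read_length / length_g,
    TPM_g = rate_g * 1e6 / sum(rate). Not taken from tpm_table.py itself.
    """
    # Per-gene rate: reads mapped, scaled by read length and gene length.
    rates = {g: counts[g] * read_length / gene_lengths[g] for g in counts}
    # Normalising constant: the read length is a common factor in every term.
    total = sum(rates.values())
    return {g: rate * 1e6 / total for g, rate in rates.items()}


# Made-up example data, for illustration only.
counts = {"geneA": 100, "geneB": 300, "geneC": 50}
gene_lengths = {"geneA": 1000, "geneB": 1500, "geneC": 500}

tpm_short = tpm(counts, gene_lengths, read_length=1)
tpm_long = tpm(counts, gene_lengths, read_length=100000000)

# The two results agree up to floating-point rounding.
same = all(math.isclose(tpm_short[g], tpm_long[g]) for g in counts)
print(same)
```

Under this assumption, a read length of 0 makes every rate zero, so the normalising sum is zero and the division fails, which would be consistent with an error appearing only in that case.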