zhpn1024 / ribotish

Ribo-seq TIS Hunter, predicting translation initiation sites and ORFs using riboseq data
http://dx.doi.org/10.1038/s41467-017-01981-8
GNU General Public License v3.0

Wrong CDS annotation #28

Open agsanbel opened 2 years ago

agsanbel commented 2 years ago

Hello,

I ran:

ribotish predict -t TIS_bams_new/R3_ProAligned.sortedByCoord.out.bam -b CHX_bams_new/R3_ProAligned.sortedByCoord.out.bam -g gencode.v19.annotation.gtf -f GRCh37.p13.genome.fa -o pred.txt --framebest

This is the output so far (the run has not finished):

No offset parameter file found for CHX_bams_new/R3_ProAligned.sortedByCoord.out.bam. Using default offset (12).
Thu Sep 22 10:43:03 2022 Estimating TIS background parameters...
Thu Sep 22 12:05:04 2022 Predicting...
Wrong CDS annotation: ENSG00000189409.8 ENST00000472264.1 56 556 556
Wrong CDS annotation: ENSG00000116649.5 ENST00000487300.1 252 611 611
Wrong CDS annotation: ENSG00000219073.3 ENST00000374666.1 3 497 497
Wrong CDS annotation: ENSG00000158062.16 ENST00000374215.1 196 938 938
Wrong CDS annotation: ENSG00000090020.6 ENST00000374084.2 289 695 695
Wrong CDS annotation: ENSG00000142765.13 ENST00000473280.1 83 312 312
Wrong CDS annotation: ENSG00000116497.13 ENST00000482212.1 267 706 706
Wrong CDS annotation: ENSG00000116922.10 ENST00000486637.1 511 852 852
Wrong CDS annotation: ENSG00000066136.15 ENST00000531464.1 344 525 525
Wrong CDS annotation: ENSG00000117385.11 ENST00000372526.2 31 654 654
Wrong CDS annotation: ENSG00000186973.6 ENST00000409396.1 29 448 448
Wrong CDS annotation: ENSG00000079277.15 ENST00000496619.1 202 738 738
Wrong CDS annotation: ENSG00000132122.7 ENST00000371841.1 69 818 818
Wrong CDS annotation: ENSG00000085831.11 ENST00000411642.2 92 931 931
Wrong CDS annotation: ENSG00000134744.9 ENST00000484723.2 191 2910 2910
Wrong CDS annotation: ENSG00000116212.10 ENST00000371368.1 261 838 838
Wrong CDS annotation: ENSG00000125703.10 ENST00000371118.1 96 665 665
Wrong CDS annotation: ENSG00000177414.9 ENST00000371077.5 424 1193 1193
Wrong CDS annotation: ENSG00000116791.9 ENST00000370870.1 158 888 888
Wrong CDS annotation: ENSG00000137944.12 ENST00000370486.1 232 1019 1019
Wrong CDS annotation: ENSG00000184371.9 ENST00000357302.4 239 597 597
Wrong CDS annotation: ENSG00000007341.14 ENST00000369664.1 174 862 862
Wrong CDS annotation: ENSG00000134262.8 ENST00000369564.1 336 1225 1225
Wrong CDS annotation: ENSG00000163349.17 ENST00000503968.1 251 570 570
Wrong CDS annotation: ENSG00000163399.11 ENST00000369494.1 264 698 698
Wrong CDS annotation: ENSG00000143452.11 ENST00000368987.1 219 791 791
Wrong CDS annotation: ENSG00000143442.17 ENST00000533351.1 167 586 586
Wrong CDS annotation: ENSG00000143569.14 ENST00000368504.1 69 1120 1120
Wrong CDS annotation: ENSG00000132676.11 ENST00000471214.1 388 1287 1287
Wrong CDS annotation: ENSG00000143320.4 ENST00000368220.1 208 456 456
Wrong CDS annotation: ENSG00000249730.1 ENST00000504970.1 0 935 935
Wrong CDS annotation: ENSG00000116191.13 ENST00000324778.5 107 906 906
Wrong CDS annotation: ENSG00000162779.16 ENST00000509175.1 310 765 765
Wrong CDS annotation: ENSG00000135837.11 ENST00000357434.2 0 392 392
Wrong CDS annotation: ENSG00000116747.8 ENST00000506303.1 488 613 613
Wrong CDS annotation: ENSG00000081237.14 ENST00000367379.1 71 517 517
Wrong CDS annotation: ENSG00000117153.11 ENST00000367258.1 87 1033 1033
Wrong CDS annotation: ENSG00000143842.10 ENST00000525442.1 365 538 538
Wrong CDS annotation: ENSG00000117625.9 ENST00000533469.1 91 540 540
Wrong CDS annotation: ENSG00000162931.7 ENST00000479800.1 123 925 925
Wrong CDS annotation: ENSG00000086619.9 ENST00000366589.1 390 605 605
Wrong CDS annotation: ENSG00000116977.14 ENST00000481485.1 480 799 799
Wrong CDS annotation: ENSG00000172059.6 ENST00000401510.1 232 614 614
Wrong CDS annotation: ENSG00000138074.10 ENST00000401463.1 290 732 732
Wrong CDS annotation: ENSG00000189350.8 ENST00000401723.1 233 720 720
Wrong CDS annotation: ENSG00000055332.12 ENST00000390013.3 257 564 564
Wrong CDS annotation: ENSG00000115828.11 ENST00000404976.1 138 992 992
Wrong CDS annotation: ENSG00000162994.11 ENST00000403506.1 290 430 430
....

I don't know why it takes so long or why that message appears. I think it is due to the annotation file, but then which file should I use?

Thank you so much!!

zhpn1024 commented 2 years ago

The quality control step was not performed, which is why no offset parameter file was found. The TIS background estimation takes some additional time. You can use the multiprocessing parameter to speed it up, and the '-v' option to see progress. The annotation file has some incomplete annotations. You can use a newer annotation version, or just ignore these transcripts.
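
For readers who would rather remove the flagged transcripts from the annotation than ignore the warnings, the following is a minimal sketch. It only assumes that each warning has the form shown in the log above (gene ID, then transcript ID) and that the GTF records carry a `transcript_id "..."` attribute. The function names are hypothetical and not part of ribotish, and whether removing these records changes the prediction results is not addressed in this thread.

```python
import re

# Matches warnings such as:
#   Wrong CDS annotation: ENSG00000189409.8 ENST00000472264.1 56 556 556
# \s+ between the words tolerates a log that was wrapped mid-message.
WARNING_RE = re.compile(r"Wrong CDS\s+annotation:\s+(\S+)\s+(\S+)")
TRANSCRIPT_ATTR_RE = re.compile(r'transcript_id "([^"]+)"')


def flagged_transcripts(log_text):
    """Return the set of transcript IDs named in 'Wrong CDS annotation' warnings."""
    return {match.group(2) for match in WARNING_RE.finditer(log_text)}


def filter_gtf_lines(gtf_lines, skip_ids):
    """Yield GTF lines whose transcript_id is not in skip_ids.

    Comment lines and records without a transcript_id attribute pass through unchanged.
    """
    for line in gtf_lines:
        match = TRANSCRIPT_ATTR_RE.search(line)
        if match and match.group(1) in skip_ids:
            continue
        yield line
```

To use it, read the saved ribotish output into a string, pass it to `flagged_transcripts`, and write the lines produced by `filter_gtf_lines` over the open GTF file to a new annotation file.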

agsanbel commented 2 years ago

The problem was that the quality control files were not in the same path. Thank you very much!

Now I am trying to use multiple processes (-p), but I get this error:

'AssertionError: group argument must be None for now'

It works when I use a single process, but it takes very long.

Thank you so much!

zhpn1024 commented 2 years ago

Please provide the details of this error.

agsanbel commented 2 years ago

ribotish predict -t TIS_bams_new/R3_ProAligned.sortedByCoord.out.bam -b CHX_bams_new/R3_ProAligned.sortedByCoord.out.bam -g gencode.v19.annotation.gtf -f GRCh37.p13.genome.fa -o pred.txt -p 4 -v

Fri Sep 23 10:14:15 2022 Loading genome...
Fri Sep 23 10:14:15 2022 Estimating TIS background parameters...
TIS background estimation result will be saved to tisBackground.txt
Traceback (most recent call last):
  File "/Users/asanchezb/miniconda3/envs/ribotish/bin/ribotish", line 56, in <module>
    main()
  File "/Users/asanchezb/miniconda3/envs/ribotish/bin/ribotish", line 34, in main
    commands[cmd].run(args)
  File "/Users/asanchezb/miniconda3/envs/ribotish/lib/python3.10/site-packages/ribotish/run/predict.py", line 154, in run
    pool = MyPool(1) # This is for memory efficiency
  File "/Users/asanchezb/miniconda3/envs/ribotish/lib/python3.10/multiprocessing/pool.py", line 215, in __init__
    self._repopulate_pool()
  File "/Users/asanchezb/miniconda3/envs/ribotish/lib/python3.10/multiprocessing/pool.py", line 306, in _repopulate_pool
    return self._repopulate_pool_static(self._ctx, self.Process,
  File "/Users/asanchezb/miniconda3/envs/ribotish/lib/python3.10/multiprocessing/pool.py", line 322, in _repopulate_pool_static
    w = Process(ctx, target=worker,
  File "/Users/asanchezb/miniconda3/envs/ribotish/lib/python3.10/multiprocessing/process.py", line 82, in __init__
    assert group is None, 'group argument must be None for now'
AssertionError: group argument must be None for now

zhpn1024 commented 2 years ago

Thank you. I think this error is related to the newer version of multiprocessing. The code is different in my Python 3.7.4 version.
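
As background on the version difference: since Python 3.8, `multiprocessing.pool.Pool` creates its workers by calling `Process(ctx, target=worker, ...)`, as the traceback shows. If a `Pool` subclass replaces `Process` with a plain `multiprocessing.Process` subclass, `ctx` is received as the `group` argument and the assertion fails. Whether ribotish's `MyPool` is written this way is an assumption based only on the traceback. The sketch below shows the commonly used alternative that works on Python 3.8 and later, supplying the custom process class through a context. All class names here are hypothetical, not ribotish's.

```python
import multiprocessing
import multiprocessing.pool


class NoDaemonProcess(multiprocessing.Process):
    """Process whose daemon flag always reads False, so it may start children."""

    @property
    def daemon(self):
        return False

    @daemon.setter
    def daemon(self, value):
        # Pool sets daemon=True on its workers; ignore that request.
        pass


class NoDaemonContext(type(multiprocessing.get_context())):
    """The platform's default context, but creating NoDaemonProcess workers."""

    Process = NoDaemonProcess


class NestablePool(multiprocessing.pool.Pool):
    """Pool whose workers are non-daemonic (hypothetical stand-in for MyPool)."""

    def __init__(self, *args, **kwargs):
        # Pass the process class via the context instead of overriding Pool.Process.
        kwargs["context"] = NoDaemonContext()
        super().__init__(*args, **kwargs)
```

With this pattern, `Pool.Process(ctx, ...)` resolves to `ctx.Process(...)`, so the context is never passed as `group`.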

agsanbel commented 2 years ago

OK! I will try changing the Python version. Thank you so much!