arefeen / TAPAS


Using TAPAS with custom annotations #2

Open jankgithub opened 6 years ago

jankgithub commented 6 years ago


Hi, is it possible to use GENCODE (v28) annotations with TAPAS? I made an equivalent file in the format you specified (refFlat) using the UCSC Table Browser, but with Ensembl identifiers. It doesn't work unless I change the gene identifiers into a more NCBI-like form -> NM_ENS.... Why does it behave this way with custom annotations?

I enclose the first lines of my input file:

C1orf141 ENST00000371007 chr1 - 67092164 67231852 67093004 67127240 8 67092164,67095234,67096251,67115351,67125751,67127165,67131141,67231845, 67093604,67095421,67096321,67115464,67125909,67127257,67131227,67231852,
C1orf141 ENST00000371006 chr1 - 67092175 67127261 67093004 67127240 6 67092175,67095234,67096251,67115351,67125751,67127165, 67093604,67095421,67096321,67115464,67125909,67127261,
C1orf141 ENST00000475209 chr1 - 67092175 67127261 67093579 67127240 7 67092175,67096251,67103237,67111576,67115351,67125751,67127165, 67093604,67096321,67103382,67111644,67115464,67125909,67127261,
C1orf141 ENST00000621590 chr1 - 67092396 67127261 67096311 67127240 3 67092396,67125751,67127165, 67096321,67125909,67127261,
PKP1 ENST00000263946 chr1 + 201283451 201332993 201283702 201328836 15 201283451,201293941,201313165,201316552,201317571,201318617,201319815,201320266,201321977,201323012,201324427,201324940,201325753,201328761,201330073, 201283904,201294045,201313560,201316697,201317779,201318795,201319878,201320381,201322133,201323189,201324581,201325127,201325838,201328868,201332993,
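For anyone checking a custom annotation before running the tool, here is a minimal sketch of a sanity check on refFlat-style rows like the ones above. It assumes 11 whitespace-separated fields per row (gene name, transcript ID, chromosome, strand, four coordinates, exon count, exon starts, exon ends) and gene names without spaces. The names `RefFlatRecord` and `parse_refflat_line` are made up for illustration and are not part of TAPAS.

```python
from typing import NamedTuple, Tuple


class RefFlatRecord(NamedTuple):
    gene: str
    transcript: str
    chrom: str
    strand: str
    tx_start: int
    tx_end: int
    cds_start: int
    cds_end: int
    exon_count: int
    exon_starts: Tuple[int, ...]
    exon_ends: Tuple[int, ...]


def parse_refflat_line(line: str) -> RefFlatRecord:
    """Parse one refFlat-style row and check that it is internally consistent."""
    # Assumption: fields are separated by tabs or spaces and contain no spaces themselves.
    fields = line.split()
    if len(fields) != 11:
        raise ValueError(f"expected 11 fields, got {len(fields)}")

    gene, transcript, chrom, strand = fields[:4]
    tx_start, tx_end, cds_start, cds_end, exon_count = (int(x) for x in fields[4:9])

    # Exon lists are comma-separated with a trailing comma, so drop empty pieces.
    exon_starts = tuple(int(x) for x in fields[9].split(",") if x)
    exon_ends = tuple(int(x) for x in fields[10].split(",") if x)

    if strand not in ("+", "-"):
        raise ValueError(f"unexpected strand: {strand!r}")
    if not (len(exon_starts) == len(exon_ends) == exon_count):
        raise ValueError("exon count does not match the exon start/end lists")

    return RefFlatRecord(gene, transcript, chrom, strand, tx_start, tx_end,
                         cds_start, cds_end, exon_count, exon_starts, exon_ends)


example = ("C1orf141 ENST00000621590 chr1 - 67092396 67127261 67096311 67127240 3 "
           "67092396,67125751,67127165, 67096321,67125909,67127261,")
rec = parse_refflat_line(example)
print(rec.transcript, rec.exon_count)  # prints: ENST00000621590 3
```

A row that passes this check is only structurally well formed; it says nothing about whether TAPAS accepts the identifier in the second column.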

Thank you in advance. Jan

arefeen commented 6 years ago

Hi,

Thanks a lot for using TAPAS. If the issue is with the second column, you can use awk to add "NM_" to every row and then run the tool. The output does not contain the second-column information.
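The same workaround can be sketched in Python instead of awk. This is only an illustration, assuming a tab-separated annotation file with the transcript ID in the second column; the function names and file paths are hypothetical and not part of TAPAS.

```python
def add_nm_prefix(line: str, prefix: str = "NM_") -> str:
    """Return the row with `prefix` prepended to the second column."""
    fields = line.rstrip("\n").split("\t")
    # Leave rows alone if they have no second column or are already prefixed.
    if len(fields) > 1 and not fields[1].startswith(prefix):
        fields[1] = prefix + fields[1]
    return "\t".join(fields)


def rewrite_annotation(src_path: str, dst_path: str, prefix: str = "NM_") -> None:
    """Copy an annotation file, prefixing the second column of every row."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            if line.strip():  # skip blank lines
                dst.write(add_nm_prefix(line, prefix) + "\n")


row = "C1orf141\tENST00000371007\tchr1\t-"
print(add_nm_prefix(row))  # second column becomes NM_ENST00000371007
```

Calling `rewrite_annotation("annotation.txt", "annotation_NM.txt")` (both paths are placeholders) would write the prefixed copy; applying the function twice does not add the prefix a second time.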


-- Thanks & Regards,

Ashraful Arefeen Email: ashraful.arefeen@csebuet.org

arefeen commented 6 years ago

I have run the tool with your annotation input, and it completes without error, so the tool is not sensitive to the naming. If there is no read coverage in the 3' UTRs of a gene, the tool outputs nothing for it.


jankgithub commented 6 years ago

It does seem to depend on the identifiers. When they start with NM_, TAPAS works fine; when they don't, there is no output at all.

In case someone finds it helpful, here are the two files I used in the analysis (one with NM_ and one without):

gencode_v28_with_gene_id_and_NM.txt gencode_v28_with_gene_id.txt

And my output (using the file with NM_-like identifiers):

output.txt

The pipeline with the other input annotation doesn't produce any output, and there is no error either.

You may want to inspect this in the future. Thank you for your help.

Best, Jan

ShenTTT commented 3 years ago

Yes, adding NM_ solved my problem #18