chpngyu / chip-seq-pipeline

Computational pipeline of ENCODE ChIP-seq analysis

Cobinding and tethered binding motifs #1

Open zxwang-cloak opened 1 year ago

zxwang-cloak commented 1 year ago

Hi,

Nice pipeline! The rules described in the article for distinguishing the types of binding (cobinding and tethered binding) are very helpful. Do you have a ready-made script that determines the binding type directly from the MEME results?

Best

singing-scientist commented 1 year ago

Greetings! I believe that was a downstream data analysis step, but that the criteria are described in the manuscript. This step may have been manual analysis without an automated pipeline. Perhaps @chpngyu may know?

Best, Chase

chpngyu commented 1 year ago

Yes. The step was a downstream analysis that depended on the data formatting. Therefore we didn't have an automated pipeline.

Best, Chun-Ping

In reply to Chase W. Nelson 倪誠志, who wrote on Wednesday, May 31, 2023, at 3:23 PM.

zxwang-cloak commented 1 year ago

Thank you very much for your quick reply. Right, the canonical and noncanonical motifs can differ between the motif results of different TFs, so we have to verify each canonical/noncanonical motif in the "meme-chip.html" file. Did you do the same, that is, inspect the results by eye rather than with code? Or do you have a better way to count the fraction of peaks containing canonical/noncanonical motifs in batch?

Best, Zixian

chpngyu commented 1 year ago

If you want to count the occurrences of a motif in ChIP-seq peaks, we use FIMO, one of the tools in the MEME Suite. If you need to compare two motifs, you can use TOMTOM, another tool in the MEME Suite.
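
For batch counting, one possible sketch (not the script used in the paper) is to parse FIMO's tab-separated output and count, per motif, how many distinct peaks have at least one hit. The function name, the `total_peaks` argument, and the p-value cutoff are assumptions for illustration; the column names follow the `fimo.tsv` layout of recent MEME Suite releases and may differ in older versions.

```python
import csv
from collections import defaultdict


def fraction_of_peaks_with_motif(fimo_lines, total_peaks, p_threshold=1e-4):
    """Return {motif_id: fraction of peaks with at least one FIMO hit}.

    fimo_lines  -- iterable of lines from a fimo.tsv file (e.g. an open file)
    total_peaks -- number of peak sequences that were scanned (assumed known)
    p_threshold -- hypothetical cutoff; hits with a larger p-value are ignored
    """
    if total_peaks <= 0:
        raise ValueError("total_peaks must be positive")

    # FIMO appends blank lines and '#' comment lines after the hits; skip them.
    rows = (line for line in fimo_lines
            if line.strip() and not line.startswith("#"))

    peaks_by_motif = defaultdict(set)
    for row in csv.DictReader(rows, delimiter="\t"):
        if float(row["p-value"]) > p_threshold:
            continue
        # A peak counts once per motif, however many hits it contains.
        peaks_by_motif[row["motif_id"]].add(row["sequence_name"])

    return {motif: len(peaks) / total_peaks
            for motif, peaks in peaks_by_motif.items()}
```

It could be called as `fraction_of_peaks_with_motif(open("fimo.tsv"), total_peaks=n)`, where `n` is the number of sequences in the FASTA file given to FIMO. Counting distinct `sequence_name` values, rather than rows, is what turns hit counts into a fraction of peaks.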

The canonical motifs of TFs in the same TF family may share one or several similar profiles, so in our paper we first identified the core motifs of each TF family. If your TFs are novel, I suggest checking whether a similar profile has been reported among the JASPAR family profiles. The motifs collected in JASPAR include not only human TFs but also TFs from a diverse set of species, e.g. vertebrates: https://jaspar.genereg.net/matrix-clusters/vertebrates/
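
When discovered motifs are compared against a JASPAR motif file with TOMTOM, the result table can be reduced to the best database match per discovered motif. The sketch below is only an illustration under assumptions: the function name and the q-value cutoff are hypothetical, and the column names (`Query_ID`, `Target_ID`, `q-value`) follow the `tomtom.tsv` layout of recent MEME Suite releases.

```python
import csv


def best_tomtom_matches(tomtom_lines, q_threshold=0.05):
    """Return {query_id: (target_id, q_value)} with the lowest-q match per query.

    tomtom_lines -- iterable of lines from a tomtom.tsv file
    q_threshold  -- hypothetical cutoff; weaker matches are dropped
    """
    # Skip the blank and '#' comment lines that TOMTOM writes after the table.
    rows = (line for line in tomtom_lines
            if line.strip() and not line.startswith("#"))

    best = {}
    for row in csv.DictReader(rows, delimiter="\t"):
        q_value = float(row["q-value"])
        if q_value > q_threshold:
            continue
        query = row["Query_ID"]
        # Keep only the most significant target for each query motif.
        if query not in best or q_value < best[query][1]:
            best[query] = (row["Target_ID"], q_value)
    return best
```

A query motif that is absent from the returned dictionary had no match below the cutoff, so it would still need to be inspected by hand.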

Best, CP.

In reply to zxwang-cloak, who wrote on Wednesday, May 31, 2023, at 7:45 PM.

zxwang-cloak commented 1 year ago

Thanks! Yes, we had already done those steps: we used "JASPAR2022_CORE_vertebrates_non-redundant_v2.meme" to annotate the motifs and FIMO to count the occurrences of a motif in the ChIP-seq peaks. So your suggestion is that we should first identify the core motifs of the TF family using TOMTOM, and then run FIMO again with the core motifs to count the fraction of peaks? As for canonical/noncanonical, I am a little confused about the concept. My understanding is that a motif is noncanonical if it is not annotated or is annotated to another TF family. Is this correct?

All the best, Zixian

chpngyu commented 1 year ago

Yes. In our definition, the canonical motif of a TF (TF1) is the sequence-specific DNA motif that the TF binds directly. The non-canonical motif is the canonical motif of a second TF (TF2) that belongs to another TF family and co-occurs in the ChIP-seq peaks of TF1. Therefore, when you get multiple motifs from the ChIP-seq data of a TF (MEME tends to report several), the core motif is the key to finding out which one is the canonical motif of the TF.
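
As a rough illustration of this definition (not code from the pipeline), the labelling can be sketched as a comparison of TF families. The function name and the input mapping are hypothetical: `motif_families` is assumed to map each discovered motif to the TF family of its best database match, or to `None` when no match was found.

```python
def classify_motifs(chip_tf_family, motif_families):
    """Label each discovered motif relative to the ChIP'd TF (TF1).

    chip_tf_family -- TF family of the TF that was immunoprecipitated
    motif_families -- {motif_id: family name, or None if unannotated}
    """
    labels = {}
    for motif_id, family in motif_families.items():
        if family is None:
            # No database match: kept separate so it can be reviewed by hand.
            labels[motif_id] = "unannotated"
        elif family == chip_tf_family:
            # Same family as TF1: candidate canonical (core) motif.
            labels[motif_id] = "canonical"
        else:
            # Canonical motif of a TF from another family (TF2).
            labels[motif_id] = "non-canonical"
    return labels
```

Motifs without any annotation are kept in a separate `unannotated` bucket here; merge them into `non-canonical` if that matches your reading of the definition.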

In reply to zxwang-cloak, who wrote on Wednesday, May 31, 2023, at 9:51 PM.

zxwang-cloak commented 1 year ago

Thanks a lot, I got it!