Immortal2333 / Telomeres_and_Centromeres

This is a method to find telomeres and centromeres in plants.

usage of IGV to find centromeres #2

MuyuenHoshino closed this issue 1 year ago

MuyuenHoshino commented 1 year ago

Dear Immortal2333,

Thanks for sharing your workflow; it helps me a lot. I am having trouble finding centromeres with IGV. I generated a .gff3 file from the .dat file with TRF2GFF and loaded the .fa file and the .gff3 file into IGV. However, IGV does not show a bar plot with peaks like the one in "IGV_all.jpg"; instead it shows "zoom in to see features".

[Screenshot: 联想截图_20230615214705]

The "zoom in" button is grey and I can't click it. When I switch from "ALL" to "PN1", I get the plot you showed in "IGV_chr1.jpg".

[Screenshot: 联想截图_20230615215309]

So what should I do now? Sorry for my awful English.

Best wishes

Immortal2333 commented 1 year ago

Thank you for your question! The issue you're experiencing might be due to the large number of TRFs in your .gff3 file, which makes it hard to display all of them in IGV's "ALL" chromosomes view. I strongly suggest filtering and selecting the specific TRFs one by one using the following workflow:

https://github.com/Immortal2333/Telomeres_and_Centromeres/blob/main/TopFiveRepeatUnit_Excel.pdf

Once you have identified the specific TRFs you're interested in, you can extract them using the following command:

grep 'period=107' genome_trf.gff3 > trf_107bp.split.gff3

This will generate a smaller .gff3 file containing only the TRFs with a period of 107 bp. You can then display it as a bar plot (Auto) in IGV and analyze the selected TRFs.

I hope this helps! Let me know if you have any further questions.
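One caveat with the grep command above: `grep 'period=107'` matches substrings, so it would also keep records with a period such as 1070. The following is a minimal sketch of a stricter filter, assuming TRF2GFF writes the period as a `period=<N>` entry in the semicolon-separated attribute column (column 9) of a tab-delimited file. `filter_period` is a hypothetical helper name, not part of this repository.

```shell
# Hypothetical helper: keep only records whose attribute column contains
# an entry that is exactly "period=<N>".
# Assumption: attributes are in column 9, separated by ';', with no spaces
# around the entries.
# Usage: filter_period 107 genome_trf.gff3 > trf_107bp.split.gff3
filter_period() {
    period="$1"
    gff="$2"
    awk -F'\t' -v p="$period" '
        /^#/ { print; next }              # keep header/comment lines
        {
            n = split($9, attrs, ";")     # split the attribute column
            for (i = 1; i <= n; i++) {
                if (attrs[i] == "period=" p) {   # exact match, not a prefix
                    print
                    break
                }
            }
        }
    ' "$gff"
}
```

Unlike the plain grep, this also passes through comment lines such as `##gff-version 3`, so the output remains a self-describing GFF3 file.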

MuyuenHoshino commented 1 year ago


Dear Immortal2333,

Thank you for your prompt reply. Actually, I tried to extract specific TRFs with grep, but it didn't help.

[Screenshot: 联想截图_20230616001142]

[Screenshot: 联想截图_20230616001246]

The last three are the gff3 files after grep, with periods of 107 bp, 214 bp, and 428 bp. After loading the gff3 file into IGV, did you do any other processing or change any settings? I'm new to bioinformatics and am probably asking very basic questions. Thanks for your patience.

Best wishes

Immortal2333 commented 1 year ago

Maybe this link (IGV tutorial) could answer your questions:

https://software.broadinstitute.org/software/igv/ChangeDataDisplay

MuyuenHoshino commented 1 year ago

Dear Immortal2333,

Thanks for your prompt reply. I have already solved this problem. I had previously seen a tutorial on visualizing .gff3 files in IGV that told me to use igvtools to generate sorted.gff3 and sorted.gff3.idx and then load the *sorted.gff3 file into IGV. But when I load the .gff3 file itself into IGV, it shows the correct result. It was a silly mistake on my part. Sorry for taking your time, and thank you again for sharing and for your prompt reply.

Best wishes