xjtu-omics / HiCAT

HiCAT #1

Closed duhuipeng closed 1 year ago

duhuipeng commented 1 year ago

Dear author, could you share the code used to generate these two pictures? [image] [image] Best, HuipengDu

yangxiaofeill commented 1 year ago

Hi Huipeng, the dotplot is generated by Gepard [https://github.com/univieCUBE/gepard]. The barplot is plotted with Python, and the code is:

import matplotlib.pyplot as plt

file = 'out_hor.normal.fa'         # input HOR fa
target_pattern = 'R3L6'            # pattern whose records are counted
outfile = target_pattern + '.pdf'  # outfile

dis_table = {}    # number of '_'-separated items in the annotation -> record count
rHORs_table = {}  # annotation item -> number of occurrences

with open(file, 'r') as f:
    while True:
        # Records are read as header/sequence line pairs.
        header = f.readline().rstrip('\n')
        if not header:
            break
        seq = f.readline().rstrip('\n')  # read to stay in step; not used below
        items = header.split(' ')
        pattern = items[0].split('::')[0][1:]  # drop the leading '>'
        if pattern != target_pattern:
            continue
        anno = items[1].split('::')
        rHor = anno[1][5:].split('_')
        dis_table[len(rHor)] = dis_table.get(len(rHor), 0) + 1
        # print(header)
        for i in rHor:
            rHORs_table[i] = rHORs_table.get(i, 0) + 1

# Pool every length of 30 or more into a single bin labelled 30.
final_dis_table = {}
for i in sorted(dis_table):
    key = i if i < 30 else 30
    final_dis_table[key] = final_dis_table.get(key, 0) + dis_table[i]
print(final_dis_table)  # bar values

mindis = sorted(final_dis_table)
x = range(len(mindis))
y = [final_dis_table[i] for i in mindis]
tick_label = mindis

bar_width = 0.3
plt.figure(figsize=(10, 8))
plt.bar(x, y, tick_label=tick_label, width=bar_width)
plt.savefig(outfile)
plt.close()
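
A note on running it: the script takes no command-line arguments. The input file and target_pattern are set at the top, and the figure is written to target_pattern + '.pdf' in the working directory. If no header matches target_pattern, the printed dictionary is empty. The following is only a minimal sketch for checking which patterns a file contains, assuming the headers follow the layout implied by the parsing above; count_patterns and the example headers are hypothetical and made up for illustration, and unlike the script it detects headers by a leading '>'.

```python
from collections import Counter


def count_patterns(fasta_lines):
    """Count records per pattern name found in FASTA header lines."""
    counts = Counter()
    for line in fasta_lines:
        line = line.strip()
        if line.startswith('>'):
            # Same expression as the script: the text before the first '::'
            # in the first space-separated field, minus the leading '>'.
            pattern = line.split(' ')[0].split('::')[0][1:]
            counts[pattern] += 1
    return counts


# Hypothetical headers, invented only to exercise the parser;
# real out_hor.normal.fa headers may look different.
example = [
    '>R3L6::a b::xxxxx1_2_3',
    'ACGT',
    '>R3L6::a b::xxxxx1_2',
    'ACGT',
    '>R1L12::a b::xxxxx4',
    'ACGT',
]
print(count_patterns(example))

# With a real file (path assumed):
# with open('out_hor.normal.fa') as f:
#     print(count_patterns(f))
```

If the pattern you set in target_pattern is not among the printed keys, the bar plot will have nothing to show for it.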

duhuipeng commented 1 year ago

Dear author, what is the command to run the code? I am working with this out_hor.normal.fa file: [image] Why does my command not produce any results? The command I ran is: python3 barplot.py. My result is as follows: [image]

duhuipeng commented 1 year ago

Dear author, what word and window parameters did you set when using Gepard? Best, HuipengDu

duhuipeng commented 1 year ago

Dear author, if I can clearly see the regular HOR, should I set word and window to the length of the HOR? How should I set these parameters if the HOR shows no apparent regularity? [image] [image] These are my results with different window settings. What criterion should I use to decide which one is more suitable?

Best, HuipengDu