RichieJu520 / Co-occurrence_Network_Analysis

R and python scripts for correlation-based network analysis

Errors when running "3.Random_vs_observed_cooccurrence.py" #1

Closed xphab closed 5 years ago

xphab commented 6 years ago

Dear Dr. Ju, at the end of line 14, there is a ' missing. Thanks for your great work.

Many thanks! I will check and make needed revisions.
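A missing closing quote like the one reported here can be caught before running a script by compiling its source without executing it. The following is a minimal sketch using Python's built-in `compile()`; the helper name `find_syntax_error` and the two sample strings are hypothetical and not part of the repository. The sketch is written for Python 3, so it is only an illustration of the idea: the original script targets Python 2.7 and should be checked with that interpreter.

```python
def find_syntax_error(source, filename="<script>"):
    """Return (line_number, message) for the first syntax error, or None."""
    try:
        # compile() parses the source but does not run it.
        compile(source, filename, "exec")
    except SyntaxError as err:
        return err.lineno, err.msg
    return None


# Hypothetical two-line sources: the first lacks the closing quote.
broken = "x = 1\nprint('The map file is tab-delimited)\n"
fixed = "x = 1\nprint('The map file is tab-delimited')\n"

print(find_syntax_error(broken))
print(find_syntax_error(fixed))
```

The exact wording of the message differs between Python versions, so only the reported line number should be relied on.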

RichieJu520 commented 6 years ago

Dear Peng, Thank you for the feedback! It appears that I carelessly deleted one quote symbol at line 11 or 12. The problem should now be solved. You can find a copy of the updated script in the attachment.

Best, Feng

On Wed, Apr 4, 2018 at 9:25 AM, xphab notifications@github.com wrote:

Dear Dr. Ju, Your scripts are very useful and helpful. Thanks a lot for your great work. When I run the Python script, it shows:

File "Random_vs_observed_cooccurrence.py", line 14
print 'The map file is a tab-delimited file with node ID (col 1) and type of node (col 2)
^
SyntaxError: EOL while scanning string literal

I have tried both Windows and Ubuntu, using Python 2.7.14 (Windows) and 2.7.12 (Ubuntu). The map file and GML file are "OTU_Order.map" and "Pos0.6-NW.modified.gml" from the example and example_output folders. Could you please give me some advice? Thanks very much.

Best, Peng


-- Feng JU, Ph. D. Postdoctoral Scientist Microbial Ecology Group, Department of Surface Waters Swiss Federal Institute of Aquatic Science and Technology (eawag) Seestrasse 79, 6047 Kastanienbaum Switzerland Phone: +41 58 765 2163 Fax: +41 58 765 2168 http://www.eawag.ch/en/aboutus/portrait/organisation/staff/profile/feng-ju/

# -*- coding: utf-8 -*-

"""
@author: Feng Ju
@email: richieju520@gmail.com
The script was written and tested in python 2.7

@cite Ju F, Xia Y, Guo F, Wang ZP, Zhang T. 2014.
@Taxonomic relatedness shapes bacterial assembly in activated sludge of globally distributed wastewater treatment plants.
@Environmental Microbiology. 16(8):2421-2432
"""

print 'This script is written for adding attribute to each NODE in GML files before imported into Gephi'
print 'This script also calculate the random and observed incidences of co-occurrence between different types of network nodes!'
print 'The map file is a tab-delimited file with node ID (col 1) and type of node (col 2)'
print 'The gml file is the gml-format network file generated from the R scripts'

# Ask for the two input paths until both are supplied
while True:
    Parameters=raw_input("Enter parameters map file and [GML file] separated by Space: ")
    try:
        P1=Parameters.strip().split(' ')[0]
        P2=Parameters.strip().split(' ')[1]
        break
    except:
        print 'errors: invalid input format or not enough input !'
        continue

# Map each node ID (col 1) to its node type (col 2)
a={}
for line in open(P1,'r'):
    try:
        a[line.rstrip().split('\t')[0]]=line.rstrip().split('\t')[1]
    except:
        print 'Please provide a tab-delimited txt file!'
        break

f=open(P2.replace('.gml','') +'.modified.gml','w')
f1=open(P1 +'.map','w')

b=[]

# Copy the GML file, adding an 'order' attribute after each node label
i=0
for line in open(P2,'r'):
    if 'label' not in line:
        f.write(line)
    else:
        i+=1
        name = line.strip().split('"')[1]
        f.write(line)
        try:
            f1.write(name+'\t'+str(a[name])+'\n')
            f.write(' '+'order'+' '+str(a[name])+'\n')
        except KeyError:
            # Node label not found in the map file
            print name
print i,'items were added into the node!'
print 'OK, work finished!'
f.close()
f1.close()

X1 = P1 +'.map'
X2 = P2.replace('.gml','') +'.modified.gml'

f=open(X2.replace('.gml','')+'_Observed_VS_Random.xls','w')
f1=open(X2.replace('.gml','')+'_edge_properties.xls','w')
f2=open(X2.replace('.gml','')+'_node_properties.xls','w')

a1,a2={},{}
lis=[]
for line in open(X1,'r'):
    a1[line.strip().split('\t')[0]]=line.strip().split('\t')[1]
    lis.append(line.strip().split('\t')[1])
print len(a1), 'Node-affiliation pairs!'

# Count how many nodes belong to each type
dic1={}
lis_U=list(set(lis))
for item in lis_U:
    dic1[item]=lis.count(item)

f2.write('\t'.join(['id','name','phylum','degree'])+'\n')
b,c,d,e,g = [],[],[],[],[]
# Collect node ids (b), node names (c), edge sources (d), targets (e) and weights (g)
for line in open(X2,'r'):
    if ' id ' in line:
        f2.write(line.strip().split(' ')[1]+'\t')
        b.append(line.strip().split(' ')[1])
    elif ' name' in line:
        f2.write(line.strip().split(' ')[1].replace('"','')+'\t')
        c.append(line.strip().split(' ')[1].replace('"',''))
        f2.write(a1[line.strip().split(' ')[1].replace('"','')]+'\t')
    elif ' degree' in line:
        f2.write(line.strip().split(' ')[1]+'\n')
    elif 'source' in line:
        d.append(line.strip().split(' ')[1])
    elif 'target' in line:
        e.append(line.strip().split(' ')[1])
    elif 'weight' in line:
        g.append(line.strip().split(' ')[1])
    else:
        continue

print len(b), len(c),'nodes!'
print len(d), len(e),'edges!'

for i in range(len(b)):
    a2[b[i]]=c[i]

j,k=0,0
n=[]
p=[]

f1.write('Source'+'\t'+'Phylum'+'\t'+'Target'+'\t'+'Phylum'+'\t'+'Weight'+'\n')
# j counts edges within one node type, k counts edges between two types
for m in range(len(g)):
    if a1[a2[d[m]]]==a1[a2[e[m]]]:
        f1.write(str(a2[d[m]])+'\t'+str(a1[a2[d[m]]])+'\t'+str(a2[e[m]])+'\t'+str(a1[a2[e[m]]])+'\t'+str(g[m])+'\n')
        j+=1
        n.append(a1[a2[d[m]]]+'__'+a1[a2[e[m]]])
    else:
        f1.write(str(a2[d[m]])+'\t'+str(a1[a2[d[m]]])+'\t'+str(a2[e[m]])+'\t'+str(a1[a2[e[m]]])+'\t'+str(g[m])+'\n')
        k+=1
        p.append(a1[a2[d[m]]]+'__'+a1[a2[e[m]]])

print j, 'Internal-type cooccurrence!'
print k, 'External-type cooccurrence!'

n1=list(set(n))
n1.sort()
dic2={}
for item in n1:
    dic2[item]=str(n.count(item))

p1=list(set(p))
p1.sort()
# Merge 'A__B' and 'B__A' into one entry
for item in p1:
    i1=p.count(item)
    i2=p.count(item.split('__')[1]+'__'+item.split('__')[0])
    dic2[item]=i1+i2
    if i2!=0:
        p1.remove(item.split('__')[1]+'__'+item.split('__')[0])
    else:
        continue

f.write(str(len(c))+' '+'nodes and'+' '+ str(len(e))+' '+'edges in the network!'+'\n')
f.write(str(j)+' '+'edges with internal-type cooccurrence!'+'\n')
f.write(str(k)+' '+'edges with external-type cooccurrence!'+'\n')

f.write('N1__N2'+'\t'+'N1-freq'+'\t'+'N2-freq'+'\t'+'Edges'+'\t'+'Random'+'\t'+'Observed'+'\t'+'O/R-ratio'+'\n')
# Within-type pairs
for key in n1:
    i1=dic1[key.split('__')[0]]
    i2=dic1[key.split('__')[1]]
    if key.split('__')[0]==key.split('__')[1]:
        random=100*float(i1*(i2-1))/(len(c)*(len(c)-1))
    else:
        random=2*100*float(i1*i2)/(len(c)*(len(c)-1))
    observed=100*float(dic2[key])/len(e)
    ratio=observed/random
    f.write('\t'.join([key, str(i1), str(i2), str(dic2[key]), str(random), str(observed), str(ratio)])+'\n')
# Between-type pairs
for key in p1:
    i1=dic1[key.split('__')[0]]
    i2=dic1[key.split('__')[1]]
    if key.split('__')[0]==key.split('__')[1]:
        random=100*float(i1*(i2-1))/(len(c)*(len(c)-1))
    else:
        random=2*100*float(i1*i2)/(len(c)*(len(c)-1))
    observed=100*float(dic2[key])/len(e)
    ratio=observed/random
    f.write('\t'.join([key, str(i1), str(i2), str(dic2[key]), str(random), str(observed), str(ratio)])+'\n')

f.write('\n')
f.write('Node-affiliation'+'\t'+'Nodes_freq'+'\n')

list_s = sorted(dic1, key=dic1.__getitem__, reverse = True)
for key in list_s:
    f.write(key+'\t'+str(dic1[key])+'\n')
print 'OK, finished!'
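The "Random" and "Observed" columns written by the script come from simple pair counting: the random value is the percentage of all possible node pairs that would join the two node types by chance, and the observed value is the percentage of edges that actually do. The following is a minimal Python 3 sketch of that arithmetic, not the script itself; the function names and the toy numbers (10 nodes split 4/6 between two hypothetical types, 12 edges) are assumptions made for illustration.

```python
def expected_percent(n1, n2, total_nodes, same_type):
    """Percentage of all possible node pairs that link the two types by chance."""
    ordered_pairs = total_nodes * (total_nodes - 1)
    if same_type:
        # n1 == n2 here: pairs drawn from within a single type
        return 100.0 * n1 * (n2 - 1) / ordered_pairs
    # Two different types: either node of the pair may come first
    return 2 * 100.0 * n1 * n2 / ordered_pairs


def observed_percent(edge_count, total_edges):
    """Percentage of the network's edges that link the two types."""
    return 100.0 * edge_count / total_edges


# Toy network: 10 nodes, 4 of type A and 6 of type B, 12 edges in total.
within_a = expected_percent(4, 4, 10, same_type=True)
within_b = expected_percent(6, 6, 10, same_type=True)
between = expected_percent(4, 6, 10, same_type=False)
print(within_a + within_b + between)  # the three cases cover every pair

# Suppose 3 of the 12 edges join an A node to a B node.
ratio = observed_percent(3, 12) / between
print(ratio)
```

An O/R ratio above 1 means the pairing occurs more often than chance alone would suggest, and a ratio below 1 means it occurs less often.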

BigTreeing commented 5 years ago

Dear Dr. Ju,

Thank you for your great work. It helps a lot.

But there were some problems while I was running ("3.Random_vs_observed_cooccurrence.py"). >m<

When I loaded my two target files (.map and .gml), the script ran without errors, but two of the generated files (i.e., "Pos0.6.modified_node_properties" and "Pos0.6.modified_Observed_VS_Random") were empty. However, when I ran the sample you provided, nothing went wrong. I have no idea why.

I'm looking forward to your help with this. Thanks a lot.

Sincerely, BigTree

RichieJu520 commented 5 years ago

BigTree, thanks for reporting, and sorry for my late response. There may be format problems with your input files. Have your problems been solved? Feng
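One way to check for the format problems mentioned above is to scan the map file before running the script. The script expects a tab-delimited file with the node ID in column 1 and the node type in column 2, and it stops reading at the first line that does not fit. The sketch below is a minimal Python 3 illustration; the helper name `check_map_lines` and the sample node IDs and types are hypothetical.

```python
def check_map_lines(lines):
    """Return (line_number, text) for lines lacking two tab-separated fields."""
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        fields = text.split("\t")
        # Both the node ID and the node type must be present and non-empty.
        if len(fields) < 2 or not fields[0] or not fields[1]:
            problems.append((number, text))
    return problems


# Hypothetical map content: line 2 uses a space instead of a tab.
sample = [
    "OTU_1\tBacteroidales\n",
    "OTU_2 Clostridiales\n",
    "OTU_3\tRhizobiales\n",
]
print(check_map_lines(sample))
```

For a real file, the same function can be called as `check_map_lines(open(path))`. Node labels in the GML file also need to match the IDs in the map file, since the script only prints the label and moves on when a label is missing from the map.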

BigTreeing commented 5 years ago

Dear Dr. Ju,

Finally, I've solved this problem by adding some code to the script ("3.Random_vs_observed_cooccurrence.py").

I added "f.close()" after the "f.write()" calls. Although I don't know why, it worked well. BTW, I ran it in a Python 3 environment.

Anyway, thank you so much.

Sincerely, BigTree
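A likely explanation for why "f.close()" helped (an assumption, since the modified script was not posted): Python buffers file writes in memory, and the data only reaches the disk when the buffer is flushed or the file is closed. In the script as posted, the three output files opened in the second half (f, f1 and f2) are never closed explicitly, so a file inspected before the interpreter releases it can appear empty. A `with` block avoids this by closing the file automatically. The sketch below is a minimal Python 3 illustration; the helper name `write_report` and the demo rows are hypothetical.

```python
import os
import tempfile


def write_report(path, rows):
    """Write tab-separated rows; leaving the with-block flushes and closes the file."""
    with open(path, "w") as handle:
        for row in rows:
            handle.write("\t".join(str(value) for value in row) + "\n")


# Write a small demo table to a temporary directory and read it back.
out_path = os.path.join(tempfile.mkdtemp(), "demo_Observed_VS_Random.xls")
write_report(out_path, [("N1__N2", "Observed"), ("A__B", 25.0)])

with open(out_path) as handle:
    print(handle.read())
```

Adding explicit `f.close()`, `f1.close()` and `f2.close()` calls at the end of the script has the same effect as the `with` block.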