Closed lukdo closed 8 years ago
My flights.graphml file looks like this, but longer, and some nodes appear twice:
<?xml version="1.0" ?>
<graphml>
<key attr.name="label" attr.type="string" id="label"/>
<graph edgedefault="directed" id="">
<node id="AiAiyamato1"/>
<node id="AbiYusuf37"/>
<node id="Chaima_vmn"/>
<node id="abu_camil"/>
<node id="ollspam"/>
<node id="RusCountering"/>
<node id="isayful082"/>
<edge directed="false" source="AbiYusuf37" target="AiAiyamato1"/>
<edge directed="false" source="Chaima_vmn" target="AbiYusuf37"/>
<edge directed="false" source="nfb9a7s8771" target="AbiYusuf37"/>
<edge directed="false" source="isayful082" target="NovostiDamask"/>
</graph>
</graphml>
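A quick check of the file can list the repeated node IDs and any edge endpoints that are never declared as nodes (in the excerpt shown, `nfb9a7s8771` and `NovostiDamask` appear only in edges, though the full file may declare them elsewhere). The following is a minimal sketch using only the standard library. The function name `check_graphml` is hypothetical, and it assumes a file without an XML namespace, as in the excerpt. Whether either issue is what igraph fails on is not established here.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def check_graphml(xml_text):
    """Return (duplicate_ids, undeclared_endpoints) for namespace-free GraphML text."""
    root = ET.fromstring(xml_text)
    # Count how often each node id is declared.
    id_counts = Counter(node.get("id") for node in root.iter("node"))
    duplicates = sorted(i for i, n in id_counts.items() if n > 1)
    # Collect edge endpoints that were never declared as a node.
    dangling = set()
    for edge in root.iter("edge"):
        for endpoint in (edge.get("source"), edge.get("target")):
            if endpoint not in id_counts:
                dangling.add(endpoint)
    return duplicates, sorted(dangling)


# Small illustrative sample in the same shape as the excerpt.
sample = """<?xml version="1.0" ?>
<graphml>
  <graph edgedefault="directed" id="">
    <node id="AbiYusuf37"/>
    <node id="AiAiyamato1"/>
    <node id="AbiYusuf37"/>
    <edge directed="false" source="AbiYusuf37" target="AiAiyamato1"/>
    <edge directed="false" source="isayful082" target="NovostiDamask"/>
  </graph>
</graphml>"""

duplicates, dangling = check_graphml(sample)
print("duplicate ids:", duplicates)
print("undeclared endpoints:", dangling)
```

For a real file, the text could be read with `open(path).read()` and passed to the same function.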
Have you tried just loading the data into igraph on its own?
import igraph
G = igraph.load('<yourgraphml>')
It's probably not the source of your problem, but it would be better to create a new dataset rather than reusing flights. You should be able to follow the instructions here: https://github.com/Lab41/Circulo/tree/master/circulo/data
Looking through those instructions, I found a step missing for a new dataset (I've now added it to the readme): setup/run_algos.py has a list called "data_choices" to which you have to add your dataset.
Good news: I changed the import method to the one you proposed, using the same as "malaria", created my dataset, and it worked. The import works now for other .graphml files. I still have the same problem for my own dataset though.
I have to work on my graphml file because it is not recognized.
Do you know why that line is wrong? It is created by default by the pygraphml library.
If you know a way to delete nodes whose ID already exists, and a way to delete unconnected nodes, I would be very interested.
Thank you very much for your help
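For the two cleanup steps asked about above (removing nodes whose ID is repeated and removing unconnected nodes), one option is to rewrite the XML directly. This is a minimal sketch with the standard library, not part of circulo or pygraphml. It assumes a namespace-free file with a single `<graph>` element, as in the excerpt, and `clean_graphml` is a hypothetical helper name. A node counts as unconnected when no edge names it as source or target.

```python
import xml.etree.ElementTree as ET


def clean_graphml(xml_text):
    """Drop repeated node ids and nodes that no edge references."""
    root = ET.fromstring(xml_text)
    graph = root.find("graph")
    # Every id that appears as an edge endpoint counts as connected.
    connected = set()
    for edge in graph.findall("edge"):
        connected.add(edge.get("source"))
        connected.add(edge.get("target"))
    seen = set()
    # findall returns a list, so removing children while looping is safe.
    for node in graph.findall("node"):
        node_id = node.get("id")
        if node_id in seen or node_id not in connected:
            graph.remove(node)
        else:
            seen.add(node_id)
    return ET.tostring(root, encoding="unicode")


# Small illustrative sample: one repeated id and one isolated node.
sample = """<graphml>
<graph edgedefault="directed" id="">
<node id="AbiYusuf37"/>
<node id="AiAiyamato1"/>
<node id="AbiYusuf37"/>
<node id="ollspam"/>
<edge directed="false" source="AbiYusuf37" target="AiAiyamato1"/>
</graph>
</graphml>"""

cleaned = clean_graphml(sample)
remaining = [n.get("id") for n in ET.fromstring(cleaned).iter("node")]
print(remaining)
```

The sketch keeps the first declaration of each ID and leaves edges untouched, so edges that point at undeclared nodes would still need separate handling.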
I'm not sure what exactly the key to that line is. It's not the best solution, but you could use pygraphml to read the graphml file and construct an igraph graph from that representation.
As far as pruning goes, the closest example appears to be the as_data example. The relevant code lines are:
# Take largest connected component
components = g.components(mode=igraph.WEAK)
if len(components) > 1:
    g = g.subgraph(max(components, key=len))
g.write_graphml(self.graph_path)
In the end I switched from "pygraphml" to "NetworkX" to write my graphml file, and the circulo library now reads it without issues.
Thank you for your help, ymt123.
Problem solved :)
Great!
Hello, I am trying to test the algorithms on a .graphml file I created. I am doing my Master's thesis on community finding using graph algorithms. Eventually I could add my dataset to circulo. But I get this error:
  File "/usr/lib/python3.4/multiprocessing/pool.py", line 599, in get
    raise self._value
SystemError: error return without exception set
It works for all the other datasets.
What I did: I put my file in the raw folder of "flights", naming it "flights.graphml". In my run.py file I have:
import os
import shutil
from circulo.data.databot import *

FILE = "flights.graphml"

class FollowersData(CirculoData):
    ...  # class body not shown

def main():
    FollowersData("flights").get_graph()

if __name__ == "__main__":
    main()
I took this code from the "southernwoman" data that is also already in a graphml format.
When I execute "python3 run_algos.py flights ALL --output algorithm_results" I get this output in the terminal:
dalys@dalys ~/Documents/circulo/circulo/setup $ python3 run_algos.py followers ALL --output algorithm_results
[Graph Generation ETL for followers ]
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/lib/python3.4/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/usr/lib/python3.4/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "run_algos.py", line 168, in data_fetcher
    databot.get_graph()
  File "/home/dalys/Documents/circulo/circulo/data/databot.py", line 91, in get_graph
    return igraph.load(self.graph_path)
  File "/usr/local/lib/python3.4/dist-packages/igraph/__init__.py", line 4063, in read
    return Graph.Read(filename, *args, **kwds)
  File "/usr/local/lib/python3.4/dist-packages/igraph/__init__.py", line 2223, in Read
    return reader(f, *args, **kwds)
SystemError: error return without exception set
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "run_algos.py", line 304, in <module>
    main()
  File "run_algos.py", line 299, in main
    run(algos, datasets, args.output[0], args.samples, args.workers, args.timeout)
  File "run_algos.py", line 210, in run
    r.get()
  File "/usr/lib/python3.4/multiprocessing/pool.py", line 599, in get
    raise self._value
SystemError: error return without exception set
dalys@dalys ~/Documents/circulo/circulo/setup $
Thank you for reading, I hope you can help.