joseluisdiaz closed 8 years ago
Hi, node id is an internal genemania id. It should be generated in an earlier pipeline step and is not something you need to provide in your input files. You may have hit a bug or some file format issue.
Those files you see in the genemania.org/data/ folder are a distant output of the pipeline (actually probably an earlier version), and the formats there may not be suitable to drop back in as inputs again.
But it's not clear to me why you want to round-trip the data. Are you just looking for a set of sample input files for the pipeline, or do you really want a binary genemania dataset for this particular set of networks? If the latter, you can probably retrieve the binaries directly via Cytoscape.
My goal is to import all networks for Homo sapiens into a local copy of genemania that I have running. Since all the information for Homo sapiens is in genemania.org/data, I did a few things with that data, trying to convert it to the input format of the pipeline (a few scripts):
If you point me to the snakemake task that should generate the ids, I'll try to fix it, but if you recommend that I get the binaries using Cytoscape, I'll do that. It would be great if you could give me a few directions.
Great that you were able to set up all this input data; it's a fair bit of work to sort through! I'll be happy to help you troubleshoot, but let me make sure you know all your options first:
Knowing all the above, if you really want to use the website (not the plugin) with custom data, you probably need to get the pipeline running as you are doing now. To help with this, I suggest you first try quickly building a test dataset. That way we can rule out any problems with your toolchain setup, and you'll be able to examine the sample input data files to see examples of the formats. I notice I didn't commit test data to this pipeline project, but I was working on a test dataset a while back in my fork here; it's in the test_data branch. Check out that branch and try running:
snakemake --config test=1
It should run to completion without error. Look in the test/data/ subdirectory for input files and test/result/ for the processed outputs.
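If it helps when examining the formats, here is a rough sketch of a small Python helper that prints the first few lines of every file under those two folders. The function name `preview_files` and the `max_lines` parameter are made up for this illustration and are not part of the pipeline; the only things taken from the thread are the `test/data` and `test/result` paths. Binary outputs will print as partly unreadable text.

```python
from pathlib import Path


def preview_files(directory, max_lines=3):
    """Return {relative_path: first few lines} for every regular file under directory.

    Hypothetical helper for eyeballing file formats; not part of the pipeline.
    """
    root = Path(directory)
    previews = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        # errors="replace" keeps binary outputs from raising a decode error.
        with path.open("r", encoding="utf-8", errors="replace") as handle:
            lines = []
            for _ in range(max_lines):
                line = handle.readline()
                if not line:
                    break
                lines.append(line.rstrip("\n"))
        previews[str(path.relative_to(root))] = lines
    return previews


if __name__ == "__main__":
    # Folder names as mentioned above; run from the pipeline checkout.
    for folder in ("test/data", "test/result"):
        print(f"== {folder} ==")
        if not Path(folder).is_dir():
            print("  (missing; did the snakemake run finish?)")
            continue
        for name, lines in preview_files(folder).items():
            print(f"  {name}")
            for line in lines:
                print(f"    {line}")
```

Comparing the previews from test/data with your own converted files is probably the quickest way to spot a column or header mismatch.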
Thank you so much, I'll try to download the data using the genemania plugin. That will be my first option.
Everything works just fine, thanks! :-)
I'm trying to import http://genemania.org/data/current/Homo_sapiens/ using this pipeline. I have a question regarding the network definitions/identifiers: they seem to be Strings (for example http://genemania.org/data/current/Homo_sapiens/Predicted.I2D-Ptacek-Snyder-2005-Yeast2Human.txt), and Generic2LuceneExporter expects a Long. What am I doing wrong?
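To narrow down a String-versus-Long mismatch like this, a small check along the following lines may help. It is only a sketch: it assumes the file is tab-separated text and that the identifier in question sits in the first column, and it does not reproduce how Generic2LuceneExporter actually reads its input. The function name `non_integer_ids` and the sample rows are invented for the example.

```python
import re

# Plain base-10 integer with optional sign, roughly what a Java Long parser accepts.
_INTEGER = re.compile(r"[+-]?[0-9]+")
_LONG_MIN, _LONG_MAX = -(2 ** 63), 2 ** 63 - 1


def non_integer_ids(lines, column=0, delimiter="\t"):
    """Return (line_number, value) pairs whose `column` value would not fit a 64-bit integer.

    Hypothetical diagnostic; the column index and delimiter are assumptions.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split(delimiter)
        if column >= len(fields):
            # Row is too short to contain the column at all.
            problems.append((number, ""))
            continue
        value = fields[column]
        if not _INTEGER.fullmatch(value):
            problems.append((number, value))
        elif not _LONG_MIN <= int(value) <= _LONG_MAX:
            problems.append((number, value))
    return problems


if __name__ == "__main__":
    # Made-up rows, only to show the shape of the report.
    sample = ["Gene_A\tGene_B\tWeight", "P12345\tQ67890\t0.5", "101\t202\t0.9"]
    print(non_integer_ids(sample))  # [(1, 'Gene_A'), (2, 'P12345')]
```

Every row it reports is one where a numeric id was expected but a string was found, which would be consistent with feeding a published output file back in as pipeline input.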