fraenkel-lab / OmicsIntegrator

This repository is the working directory for the Garnet-Forest bundle of Python scripts for analyzing diverse forms of 'omic' data in a network context.
http://fraenkel.mit.edu/omicsintegrator
BSD 2-Clause "Simplified" License

Parsing problems in FOREST_INPUT.xls file from Garnet #2

Closed agitter closed 9 years ago

agitter commented 9 years ago

When Garnet parses the motif IDs in events_to_genes_with_motifsregression_results.xls, I've noticed a few problems in the resulting events_to_genes_with_motifsregression_results_FOREST_INPUT.xls, which contains individual TFs instead of motifs.

Using a .csv or tab-delimited text file instead of .xls could help with the second and third issues.
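As a rough illustration of the tab-delimited suggestion, here is a minimal sketch using Python's standard `csv` module. It is not Garnet's actual output code: the function names, column headers, and example rows are all hypothetical, and it only shows that identifiers written as plain tab-delimited text are read back unchanged.

```python
import csv
import io


def write_tsv(rows, handle):
    """Write rows (lists of strings) as tab-delimited text."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)


def read_tsv(handle):
    """Read tab-delimited text back into a list of rows."""
    return [row for row in csv.reader(handle, delimiter="\t")]


# Hypothetical TF rows; the real FOREST_INPUT columns may differ.
rows = [
    ["TF", "score"],
    ["FOXA1", "0.5"],
    ["STAT3", "1.2"],
]

# An in-memory buffer stands in for a file on disk.
buf = io.StringIO()
write_tsv(rows, buf)
buf.seek(0)
round_trip = read_tsv(buf)
```

With a real file, the same functions could be given handles opened with `open(path, "w", newline="")` and `open(path, newline="")`.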

sgosline commented 9 years ago

Ok, I can work on this either later this week or early next week.

sgosline commented 9 years ago

I fixed the second two issues, but the third issue is not that any TFs have a '.1' in the name. The issue is actually introduced by some 'scrubbing' of identifiers from the TAMO source file. I removed the gene name that has a '.1' in it, as it is not legitimate in any species. This is fixed and will be committed with my latest updates to allow for GPS file parsing.