jtlovell / GENESPACE

Location of the Collinearity files #66

Closed sanyalab closed 1 year ago

sanyalab commented 1 year ago

Hi

I was wondering whether the collinearity files generated by the MCScanX run are retained somewhere in the intermediate outputs of GENESPACE. If so, can you tell me where they are? If not, is there a way to convert the syntenic orthogroup output (you provided me the script in #31) into MCScanX's collinearity format?

Thanks, Abhijit

jtlovell commented 1 year ago

We don't keep those around. As noted in the paper, MCScanX is only used to find the initial 'seed' collinear hits. These hits, and especially the actual blocks coming out of MCScanX, are not trustworthy and need to be refined further (see fig 1b of the paper). You can re-create the output files by filtering the syntenic hit files down to the hits flagged with isAnchor = TRUE, but I wouldn't recommend it.
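For anyone who wants to try the filtering step anyway, here is a minimal Python sketch. It assumes the syntenic hit file has been exported as tab-delimited text with a header row and that the logical column is written as the strings TRUE/FALSE. Only `isAnchor` comes from the comment above; the column names `id1`, `id2`, and `blkID`, and the function `anchor_pairs_by_block`, are hypothetical placeholders to swap for whatever your files actually contain. The result is a grouping of anchor pairs per block, not a byte-for-byte MCScanX collinearity file.

```python
import csv
import io
from collections import defaultdict


def anchor_pairs_by_block(handle, id1="id1", id2="id2",
                          block="blkID", flag="isAnchor"):
    """Return {block ID: [(gene1, gene2), ...]} for anchor hits only.

    All column names except `isAnchor` are assumptions; pass the real
    ones from your hit file as keyword arguments.
    """
    blocks = defaultdict(list)
    for row in csv.DictReader(handle, delimiter="\t"):
        # Keep only hits flagged as anchors (assumed to be stored as "TRUE").
        if row[flag].strip().upper() == "TRUE":
            blocks[row[block]].append((row[id1], row[id2]))
    return dict(blocks)


# Tiny made-up table standing in for a real hit file.
example = (
    "id1\tid2\tblkID\tisAnchor\n"
    "a1\tb1\tblk1\tTRUE\n"
    "a2\tb2\tblk1\tTRUE\n"
    "a3\tb9\tblk1\tFALSE\n"
    "a4\tb4\tblk2\tTRUE\n"
)

blocks = anchor_pairs_by_block(io.StringIO(example))
for blk_id, pairs in blocks.items():
    print(blk_id, pairs)
```

To run it on a real file, pass an open file handle instead of the `io.StringIO` wrapper. The same caveat applies as above: these anchors are the refined GENESPACE hits, not the raw MCScanX blocks.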

sanyalab commented 1 year ago

Hi,

I have another question. I have a three-column relationship file, an MCScanX collinearity file, and the gene GFF. Is there a way to use any or all of these inputs to build the orthogroups? If so, how?

If this is not already possible, could it be developed and incorporated?

Thanks, Abhijit

jtlovell commented 1 year ago

There is not. GENESPACE runs MCScanX internally on curated BLAST hits. The difference between the left and right dotplots in Fig. 1B of the paper illustrates why we don't trust raw MCScanX results.