LegumeFederation / legfed_gene_families

A repository for managing tasks relating to the production of gene families for use by the Legume Federation

some trees have single-quotes around their node labels- intentional or can we make consistent? #7

Closed adf-ncgr closed 7 years ago

adf-ncgr commented 7 years ago

@cann0010 were these manually processed in some way?

    37_trees_combined]$ ls | xargs grep -l "'"
    legfed_2017.L.00135
    legfed_2017.L.004C4
    legfed_2017.L.0058Z
    legfed_2017.L.0071V
    legfed_2017.L.007VG
    legfed_2017.L.00B6Q
    legfed_2017.L.00BVJ
    legfed_2017.L.00C56
    legfed_2017.L.00C64
    legfed_2017.L.00DCM
    legfed_2017.L.00Z9H
    legfed_2017.L.014Z5
    legfed_2017.L.019ZZ
    legfed_2017.L.030XG
    legfed_2017.L.034TV
    legfed_2017.L.04B6M

I can get rid of them for loading purposes, but it would be good to make things consistent in this regard if possible.

StevenCannon-USDA commented 7 years ago

Yes, they were. Trees for the largest of the families were calculated with FastTree and then manually rooted using FigTree. I suspect the single-quotes were introduced during the manual rooting. This affected 16 trees.

Corrected now in genefams_ks_mcl_2017/37_trees_combined.tar.gz with:

    ls L.0* | xargs grep -l "'" | xargs -I{} perl -pi -e 's/'\''//g' {}
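For anyone wanting to try the same kind of cleanup on a scratch copy first, here is a minimal sketch of the detect, strip, and re-check cycle. It is not the command used above: it swaps `perl -pi` for plain `tr` so it has no Perl dependency, and the file names (`L.0DEMO1`, `L.0DEMO2`), the Newick strings, and the helper functions `strip_quotes` and `count_quoted` are all made up for illustration. Note that blanket removal is only safe when no label actually needs quoting (Newick uses single quotes to protect labels containing whitespace or punctuation), which is assumed to hold here.

```shell
#!/bin/sh
# Sketch only: demo file names and tree strings are hypothetical.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Two made-up tree files: one with quoted labels, one without.
printf "%s\n" "(('geneA':0.1,'geneB':0.2):0.05,geneC:0.3);" > "$workdir/L.0DEMO1"
printf "%s\n" "((geneA:0.1,geneB:0.2):0.05,geneC:0.3);" > "$workdir/L.0DEMO2"

# strip_quotes DIR: delete every single quote from files in DIR that contain one.
strip_quotes() {
    for f in "$1"/*; do
        if grep -q "'" "$f"; then
            tr -d "'" < "$f" > "$f.tmp" && mv "$f.tmp" "$f"
        fi
    done
}

# count_quoted DIR: print how many files in DIR contain a single quote.
count_quoted() {
    n=0
    for f in "$1"/*; do
        if grep -q "'" "$f"; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

before=$(count_quoted "$workdir")
strip_quotes "$workdir"
after=$(count_quoted "$workdir")
echo "files with quotes: before=$before after=$after"
```

Re-running the count after the strip is the useful part: a result of zero confirms that no quoted labels remain before the tarball is rebuilt.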