Open mozack opened 1 year ago
The CAT pipeline depends on the Minigraph-Cactus graph, so it could only be applied to 44 samples (HG002, HG005, and NA19240 were set aside so they could be used for benchmarking). The Ensembl pipeline, by contrast, should include gene annotations for all 47 samples. The Ensembl gene annotations are available at: https://s3-us-west-2.amazonaws.com/human-pangenomics/index.html?prefix=submissions/8E6C4ACC-FEA9-4DD8-94A3-B92234206F95--Y1_ENSEMBL_V1/
@mhaukness-ucsc, could you please check if the above link is the version used in the HPRC marker paper?
@juklucas, do you think we should provide an index file for the Ensembl gene annotations as well?
Thanks so much! I see the Ensembl annotations and will try them out.
The above link should be correct for CAT when comparing against the marker paper results; for new analyses, however, Ensembl should be used.
Hi,
Thank you for this fantastic resource!
The CAT genes index does not appear to have annotation entries for 3 samples: HG002, HG005, and NA19240.
https://github.com/human-pangenomics/HPP_Year1_Assemblies/blob/main/annotation_index/Year1_assemblies_v2_genbank_CAT_genes.index
Are the gene annotations for these 3 samples available elsewhere?
Thanks!
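
For anyone who wants to check which samples are missing from an index file themselves, here is a minimal sketch. It assumes the index is tab-separated with a header row and a column named `sample`; that layout, the column name, and the toy rows with placeholder paths are all assumptions for illustration, not the confirmed format of the CAT genes index.

```python
import csv
import io


def missing_samples(index_text, expected, sample_column="sample"):
    """Return the expected sample IDs that have no row in a tab-separated index.

    `sample_column` is a hypothetical column name; adjust it to match
    the header of the real index file.
    """
    reader = csv.DictReader(io.StringIO(index_text), delimiter="\t")
    # Collect every non-empty sample ID that appears in the index.
    present = {row[sample_column].strip() for row in reader if row.get(sample_column)}
    return sorted(set(expected) - present)


# Toy index with placeholder paths (not real locations).
toy_index = (
    "sample\tcat_genes\n"
    "HG00438\tplaceholder/HG00438.gff3.gz\n"
    "HG01071\tplaceholder/HG01071.gff3.gz\n"
)

expected_samples = ["HG00438", "HG01071", "HG002", "HG005", "NA19240"]
print(missing_samples(toy_index, expected_samples))
```

To run it against a real index, read the file contents into a string and pass in the full list of sample IDs you expect to find.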