tanlongzhi / dip-c

Tools to analyze Dip-C (or other 3C/Hi-C) data

Discrepancy in txt file names between 2018 and 2019 publications #57

Closed: carliemcgrath closed this issue 1 year ago

carliemcgrath commented 1 year ago

Hi @tanlongzhi! I'm trying to work with the final 3D genome txt files from the 2019 publication, downloaded from GEO (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE121791). The GEO page says 'please see 00README.md' for information on data processing. On the corresponding GEO page for the 2018 publication, the 'data processing' section says that the final 3D genome file for each cell is named 'impute3.round4.3dg', for example 'GSM3271347_gm12878_01.impute3.round4.3dg.txt'. However, the files from the 2019 publication do not follow the same naming format. I see files such as 'GSM3446115_cell_014.20k.2.clean.3dg.txt'. Are these the final 3D genome files? Also, where can I find this 00README.md, assuming it is different from the README.md and README_old.md in this repo? Thank you so much!