aazaff closed this issue 7 years ago
- GDD corpus: "In the 2.7-2.8 mil range. I'd say closer to 2.7" - Ian
- Number of documents app was run on: 76,111 (probably a few less because of chunk-splitting being done by row, not by doc)
- Total number of sedimentary formations in Macrostrat: 5,022
- Total number of cleaned sedimentary formations in Macrostrat: 4,682 (number of sedimentary formations in macrostrat with a t_age younger than Precambrian. Not including Muddy Sandstone, Mutual Formation, or Sandy Limestone)
- Total number of non-candidate units (in PBDB): 2,021
- Total number of candidate units (not in PBDB): 2,661
- Total number of units matched to any document (candidate and non-candidate): 1,847
- Total number of non-candidate units matched to any document: 1,115
- Total number of candidate units matched to any document: 732
- Total number of candidate units not matched to any document: 1,929
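Since these numbers will be recalculated, a quick consistency check on the counts above may be useful. The following is only a sketch in Python, not the app's actual script: the dictionary keys and the `consistency_checks` helper are hypothetical names introduced here, and the values are copied from the figures above.

```python
# Counts copied from the summary above; key names are illustrative only.
stats = {
    "cleaned_sedimentary": 4682,
    "non_candidate": 2021,        # units already in PBDB
    "candidate": 2661,            # units not in PBDB
    "matched_any": 1847,
    "matched_non_candidate": 1115,
    "matched_candidate": 732,
    "unmatched_candidate": 1929,
}


def consistency_checks(s):
    """Return a dict mapping each identity the counts should satisfy to True/False."""
    return {
        "candidate + non_candidate == cleaned":
            s["candidate"] + s["non_candidate"] == s["cleaned_sedimentary"],
        "matched candidate + matched non_candidate == matched any":
            s["matched_candidate"] + s["matched_non_candidate"] == s["matched_any"],
        "matched + unmatched candidate == candidate":
            s["matched_candidate"] + s["unmatched_candidate"] == s["candidate"],
    }


checks = consistency_checks(stats)

# Share of candidate units that were matched to at least one document.
candidate_match_rate = stats["matched_candidate"] / stats["candidate"]

if __name__ == "__main__":
    for name, ok in checks.items():
        print(f"{'OK  ' if ok else 'FAIL'} {name}")
    print(f"Candidate match rate: {candidate_match_rate:.1%}")
```

With the current figures all three identities hold. Re-running the same checks after the final app run would flag any count that drifts out of sync with the others.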
See the stats page for the script.
UPDATE: Re-running app to include critical stats calculations in output. (These numbers will be recalculated for final app run)