pulibrary / dpul-collections

An inspiring environment for global communities to engage with diverse digital collections

Don't query for unnecessary objects in the Figgy DB for Hydration #156

Open tpendragon opened 4 hours ago

tpendragon commented 4 hours ago

Acceptance Criteria

First Step

Run a query to count objects by internal resource in Figgy and post the results here so we can decide which classes to exclude.

tpendragon commented 2 hours ago

Resource counts:

figgy_production=# select internal_resource, COUNT(*) FROM orm_resources GROUP BY internal_resource;
    internal_resource    |  count
-------------------------+----------
 CDL::ResourceChargeList |     1818
 Collection              |      651
 DeletionMarker          |   718306
 EphemeraBox             |      608
 EphemeraField           |      135
 EphemeraFolder          |    69663
 EphemeraProject         |       29
 EphemeraTerm            |     1563
 EphemeraVocabulary      |       45
 Event                   | 19456476
 FileSet                 | 11709432
 Numismatics::Accession  |      887
 Numismatics::Coin       |    18437
 Numismatics::Firm       |      201
 Numismatics::Issue      |    11541
 Numismatics::Monogram   |       56
 Numismatics::Person     |     2447
 Numismatics::Place      |     1436
 Numismatics::Reference  |      506
 Playlist                |      816
 PreservationObject      | 10234691
 ProxyFileSet            |     7605
 RasterResource          |     1031
 ScannedMap              |    36124
 ScannedResource         |   217298
 Template                |       46
 VectorResource          |     9028
(27 rows)

tpendragon commented 2 hours ago

Looks like removing PreservationObjects and Events will get rid of about 29.7 million rows (19,456,476 + 10,234,691 = 29,691,167).
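
To double-check that estimate, here is a small sketch that re-adds the counts from the query output above. The `EXCLUDED` set only reflects the two classes mentioned in this thread; it is an assumption for illustration, not a final decision on what hydration should skip.

```python
# Row counts copied from the query output posted earlier in this issue.
counts = {
    "CDL::ResourceChargeList": 1818,
    "Collection": 651,
    "DeletionMarker": 718306,
    "EphemeraBox": 608,
    "EphemeraField": 135,
    "EphemeraFolder": 69663,
    "EphemeraProject": 29,
    "EphemeraTerm": 1563,
    "EphemeraVocabulary": 45,
    "Event": 19456476,
    "FileSet": 11709432,
    "Numismatics::Accession": 887,
    "Numismatics::Coin": 18437,
    "Numismatics::Firm": 201,
    "Numismatics::Issue": 11541,
    "Numismatics::Monogram": 56,
    "Numismatics::Person": 2447,
    "Numismatics::Place": 1436,
    "Numismatics::Reference": 506,
    "Playlist": 816,
    "PreservationObject": 10234691,
    "ProxyFileSet": 7605,
    "RasterResource": 1031,
    "ScannedMap": 36124,
    "ScannedResource": 217298,
    "Template": 46,
    "VectorResource": 9028,
}

# Assumption: only the two classes named in this thread are excluded.
EXCLUDED = {"Event", "PreservationObject"}

total_rows = sum(counts.values())
excluded_rows = sum(n for name, n in counts.items() if name in EXCLUDED)
remaining_rows = total_rows - excluded_rows

print(f"total rows:     {total_rows:,}")
print(f"excluded rows:  {excluded_rows:,} ({excluded_rows / total_rows:.1%})")
print(f"remaining rows: {remaining_rows:,}")
```

With these counts, the two classes account for 29,691,167 of 42,500,876 rows (about 70%), leaving 12,809,709 rows, most of which are FileSets.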