Open kylehuynh205 opened 1 year ago
Amy tried the module https://github.com/mjordan/islandora_repository_reports and found that it covers these use cases:
But not:
After the demo, we saw that the module presents its reports as charts, which can be exported to CSV. However, the open question is whether we need to include counts for sub-collections, since we're not sure how much this use case is needed. There is also an ongoing ticket related to this: https://github.com/mjordan/islandora_repository_reports/issues/24
For now, Amy is going to fork the module and study the code only to see whether we can extend it; no development is involved yet.
I looked through the module and found two straightforward ways of extending it:
Approach 1: Add a new report type to the main module by creating a new data source file under /src/Plugin/DataSource and then adding it to the services.yml file.
Approach 2: Create a separate module for the new report type, like the modules under /modules, which will need to be enabled separately on the site.
Using approach 1, I created a new report type for displaying the number of media, grouped by collection (missing use case mentioned previously) [forked repo: amym-li/islandora_repository_reports]. However, it has the same issue where it does not include media counts for subcollections or compound objects.
Example: This graph displays the media counts for the top-level collections. The repository holds ~323 media files in total, but the graph shows only 10, since it counts only the media that are immediate children of a collection.
I pushed the new media-count-by-collection report type to a branch at digitalutsc/islandora_repository_reports. This report type also has the issue where it only counts the "first layer" of media (i.e. ignores media in subcollections and compound objects).
Some suggestions to resolve this issue:
Recursively create a list of descendants and then sum up the node/media counts for each descendant.
In /src/Utils.php:
+   /**
+    * Gets the node ids of all descendants of a given node.
+    *
+    * @param string|int|null $parent_id
+    *   The node to check.
+    * @param array $discovered
+    *   An array containing discovered descendants.
+    *
+    * @return array
+    *   An array containing node ids of $parent_id's descendants.
+    */
+   public function getDescendants($parent_id, $discovered = []) {
+     if (is_null($parent_id)) {
+       return [];
+     }
+
+     if (!in_array($parent_id, $discovered)) {
+       $discovered[] = $parent_id;
+     }
+
+     // Get the parent node's immediate children.
+     $children_query = \Drupal::entityQuery('node')->condition('field_member_of', $parent_id);
+     $children_result = $children_query->execute();
+     $children = array_values($children_result);
+
+     // Remove already discovered children.
+     $children = array_diff($children, $discovered);
+
+     // Mark new children as discovered.
+     $discovered = array_merge($children, $discovered);
+
+     $descendants = $children;
+     foreach ($children as $child) {
+       $grandchildren = $this->getDescendants($child, $discovered);
+       $descendants = array_merge($descendants, $grandchildren);
+     }
+
+     return array_unique($descendants);
+   }
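The traversal itself can be tried outside Drupal. The following is a minimal sketch, not the module's code: `collectDescendants()` and the `$children_of` map are hypothetical names, and the in-memory map stands in for the `field_member_of` entity query. It walks the tree level by level and keeps a single visited set, so each node is expanded at most once even if the membership data contains a cycle.

```php
<?php

/**
 * Collects all descendant ids of $root_id, level by level.
 *
 * $children_of is a hypothetical in-memory map (parent id => child ids)
 * standing in for the field_member_of entity query.
 */
function collectDescendants(array $children_of, $root_id): array {
  $seen = [$root_id => TRUE];
  $queue = [$root_id];
  $descendants = [];

  while ($queue) {
    $parent = array_shift($queue);
    foreach ($children_of[$parent] ?? [] as $child) {
      // Skip nodes already visited, which also guards against cycles.
      if (isset($seen[$child])) {
        continue;
      }
      $seen[$child] = TRUE;
      $descendants[] = $child;
      $queue[] = $child;
    }
  }

  return $descendants;
}

// Collection 1 contains subcollection 2 and object 3; 2 contains 4 and 5.
// The 5 => [1] entry is a deliberate cycle to show that it is ignored.
$children_of = [1 => [2, 3], 2 => [4, 5], 5 => [1]];
print_r(collectDescendants($children_of, 1));
```

In a Drupal context the inner lookup would be replaced by a query on `field_member_of`, as in `getDescendants()` above.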
Example usage in /src/Plugin/DataSource/Collection.php:
    public function getData() {
      $utilities = \Drupal::service('islandora_repository_reports.utilities');
      if (count($utilities->getSelectedContentTypes()) == 0) {
        return [];
      }
      $entity_type_manager = \Drupal::service('entity_type.manager');
      $node_storage = $entity_type_manager->getStorage('node');
      $result = $node_storage->getAggregateQuery()
        ->groupBy('field_member_of')
        ->aggregate('field_member_of', 'COUNT')
        ->condition('type', $utilities->getSelectedContentTypes(), 'IN')
        ->execute();
      $collection_counts = [];
      foreach ($result as $collection) {
        if (!is_null($collection['field_member_of_target_id'])) {
          if ($collection_node = \Drupal::entityTypeManager()->getStorage('node')->load($collection['field_member_of_target_id'])) {
            if ($utilities->nodeIsCollection($collection_node)) {
              $collection_counts[$collection_node->getTitle()] = $collection['field_member_of_count'];
+             // Get all child nodes belonging to this collection.
+             $children = $utilities->getDescendants($collection['field_member_of_target_id']);
+
+             // Sum up the member_of counts for all children.
+             $target_ids = array_column($result, 'field_member_of_target_id');
+             foreach ($children as $child_id) {
+               $child_result = array_search($child_id, $target_ids);
+               // Children with no members of their own have no row in $result.
+               if ($child_result !== FALSE) {
+                 $collection_counts[$collection_node->getTitle()] += $result[$child_result]['field_member_of_count'];
+               }
+             }
            }
          }
        }
      }
      $this->csvData = [[t('Collection'), 'Count']];
      foreach ($collection_counts as $collection => $count) {
        $this->csvData[] = [$collection, $count];
      }
      return $collection_counts;
    }
Example usage in /src/Plugin/DataSource/MediaByCollection.php:
    public function getData() {
      $utilities = \Drupal::service('islandora_repository_reports.utilities');
      $entity_type_manager = \Drupal::service('entity_type.manager');
      $media_storage = $entity_type_manager->getStorage('media');
      $result = $media_storage->getAggregateQuery()
        ->groupBy('field_media_of')
        ->aggregate('field_media_of', 'COUNT')
        ->execute();
      $media_counts = [];
      foreach ($result as $collection) {
        if (!is_null($collection['field_media_of_target_id'])) {
          if ($collection_node = \Drupal::entityTypeManager()->getStorage('node')->load($collection['field_media_of_target_id'])) {
            if ($utilities->nodeIsCollection($collection_node)) {
              $media_counts[$collection_node->getTitle()] = $collection['field_media_of_count'];
+             // Get all child nodes belonging to this collection.
+             $children = $utilities->getDescendants($collection['field_media_of_target_id']);
+
+             // Sum up the media_of counts for all children.
+             $target_ids = array_column($result, 'field_media_of_target_id');
+             foreach ($children as $child_id) {
+               $child_result = array_search($child_id, $target_ids);
+               // Children with no media of their own have no row in $result.
+               if ($child_result !== FALSE) {
+                 $media_counts[$collection_node->getTitle()] += $result[$child_result]['field_media_of_count'];
+               }
+             }
            }
          }
        }
      }
      $this->csvData = [[t('Collection'), 'Count']];
      foreach ($media_counts as $collection => $count) {
        $this->csvData[] = [$collection, $count];
      }
      return $media_counts;
    }
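One alternative to calling `array_search` for every descendant is to index the direct counts by target id once and then look each descendant up in that map. A descendant with no members has no row in the aggregate result, so the lookup should fall back to zero. The sketch below assumes rows shaped like the aggregate results above; `rollUpCounts()` and `$descendants_of` are hypothetical names, and the data is made up for illustration.

```php
<?php

/**
 * Adds each descendant's direct count to its ancestor's direct count.
 *
 * @param array $rows
 *   Aggregate query rows, shaped like the results in the examples above.
 * @param array $descendants_of
 *   Hypothetical map of collection id => descendant ids.
 */
function rollUpCounts(array $rows, array $descendants_of): array {
  // Index direct counts by target id once, instead of searching per child.
  $direct = array_column($rows, 'field_member_of_count', 'field_member_of_target_id');

  $totals = [];
  foreach ($descendants_of as $collection_id => $descendants) {
    $total = (int) ($direct[$collection_id] ?? 0);
    foreach ($descendants as $id) {
      // Descendants with no members have no row and contribute zero.
      $total += (int) ($direct[$id] ?? 0);
    }
    $totals[$collection_id] = $total;
  }
  return $totals;
}

// Made-up data: collection 1 directly holds 2 and 3; subcollection 2 holds 4 and 5.
$rows = [
  ['field_member_of_target_id' => 1, 'field_member_of_count' => 2],
  ['field_member_of_target_id' => 2, 'field_member_of_count' => 2],
];
$descendants_of = [1 => [2, 3, 4, 5], 2 => [4, 5]];
print_r(rollUpCounts($rows, $descendants_of));
```

The same shape would apply to the media report by swapping in the `field_media_of_count` and `field_media_of_target_id` columns.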
Use a Solr query to get a list of all descendant nodes.
Solr has an itm_field_descendant_of field that stores the ids of the node's ancestors.
(See https://github.com/mjordan/islandora_repository_reports/issues/24#issuecomment-631474177)
Example query: /select?q=itm_field_descendant_of:99 returns all nodes that have node 99 as an ancestor.
Then loop through the returned nodes and total the node/media counts.
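A rough sketch of the Solr approach, under several assumptions: the base URL passed in is a placeholder for the site's Solr core, `its_nid` is assumed to be the indexed node id field (it depends on the Search API index configuration), and all three function names are hypothetical. Only `itm_field_descendant_of` comes from the linked comment. The functions are kept free of network calls so they can be tried on their own.

```php
<?php

/**
 * Builds a Solr select URL for all documents descended from a node.
 *
 * $solr_base is an assumed core URL; its_nid as the id field is an assumption.
 */
function buildDescendantQueryUrl(string $solr_base, int $nid, int $rows = 1000): string {
  $params = [
    'q' => 'itm_field_descendant_of:' . $nid,
    'fl' => 'its_nid',
    'rows' => $rows,
    'wt' => 'json',
  ];
  return rtrim($solr_base, '/') . '/select?' . http_build_query($params);
}

/**
 * Extracts node ids from a decoded Solr JSON response.
 */
function extractNodeIds(array $solr_response): array {
  $docs = $solr_response['response']['docs'] ?? [];
  return array_map('intval', array_column($docs, 'its_nid'));
}

/**
 * Totals the direct counts (node id => count) of the given nodes.
 */
function totalCounts(array $node_ids, array $direct_counts): int {
  $total = 0;
  foreach ($node_ids as $nid) {
    // Nodes without members or media have no entry and contribute zero.
    $total += $direct_counts[$nid] ?? 0;
  }
  return $total;
}

// Example with a canned response instead of a live Solr request.
$response = json_decode('{"response":{"numFound":2,"docs":[{"its_nid":101},{"its_nid":102}]}}', TRUE);
$ids = extractNodeIds($response);
echo totalCounts($ids, [101 => 3, 102 => 4]), PHP_EOL;
```

A single Solr request replaces the per-node entity queries of the recursive approach, at the cost of depending on the index being up to date.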
Slight modification:
This ticket is meant for the starter site: build https://github.com/Islandora-Devops/isle-dc with make starter_dev, then ingest content with islandora_workbench.
TODO:
Use Views (potentially with its aggregation feature; another module may be needed to assist) to generate the reports below for objects in the repository:
Scalability:
Hint:
Started view (started by Nat):