When providing search results, the SWR Catalogue will display a dataset or knowledge source to the user and show in which external repositories it is available. The SWR also provides metadata, data, and knowledge validation; among other things, it automatically checks the persistence of available data/knowledge (e.g., whether a dataset uploaded to Zenodo still exists, or whether a Web service endpoint is still alive and providing data). https://github.com/soilwise-he/Soilwise-userstories/issues/18
[ ] data completeness (only internal use, no reporting!!)
[ ] automatically checks persistence of available data/knowledge (does this duplicate or overlap with "Display Broken Links"?)
With acceptance criteria:
Display a Dataset
[x] Users can search for datasets using keywords, filters (e.g., date range, author), and advanced search options.
[x] Search results for datasets are displayed within 2 seconds for typical queries.
[x] Datasets are displayed in a user-friendly format with information about title, source, publication date, and summary.
[x] The user shall be able to view detailed information of the dataset by clicking on the search result, which opens a detailed view page.
[x] Users can sort datasets by columns (e.g., date, title).
[x] Dataset listings with more than 50 entries are paginated.
[x] Users can navigate between pages using 'Next', 'Previous', and direct page number buttons.
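
The pagination and navigation criteria above could be sketched roughly as follows. This is only an illustration, not the SWR Catalogue's implementation: the names `Page`, `paginate`, and `PAGE_SIZE` are hypothetical, and the sketch assumes the full result list is already in memory and that 50 is used as the page size.

```python
from dataclasses import dataclass
from math import ceil

PAGE_SIZE = 50  # assumption: the 50-entry threshold doubles as the page size


@dataclass
class Page:
    items: list
    number: int          # 1-based page number actually served
    total_pages: int
    has_previous: bool   # drives the 'Previous' button
    has_next: bool       # drives the 'Next' button


def paginate(results: list, page: int, page_size: int = PAGE_SIZE) -> Page:
    """Return one page of results plus the state needed for navigation buttons."""
    total_pages = max(1, ceil(len(results) / page_size))
    # Clamp out-of-range requests (e.g. a stale direct page link) to a valid page.
    page = min(max(page, 1), total_pages)
    start = (page - 1) * page_size
    return Page(
        items=results[start:start + page_size],
        number=page,
        total_pages=total_pages,
        has_previous=page > 1,
        has_next=page < total_pages,
    )
```

Direct page number buttons can then be rendered from `range(1, page.total_pages + 1)`. A production catalogue would more likely push the offset and limit down into the search backend rather than slice a full list.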
Display a Knowledge Source
[x] Users can search for knowledge sources using keywords, filters (e.g., publication year, author), and advanced search options.
[x] Knowledge source search results are displayed within 2 seconds for typical queries.
[x] Each knowledge source is displayed with metadata including title, author, publication date, abstract, and source URL.
[x] Users can view the full text of the knowledge source directly or via a provided link.
[x] Knowledge sources are categorized (e.g., articles, books, reports).
[x] Users can filter search results by category.
Show Originating External Repositories
[x] Each document in the search results displays the originating external repository.
[ ] Users can view a list of all external repositories included in the search.
[ ] Users can click on the repository name to view more details (e.g., repository description, URL).
[x] The user shall be able to filter search results by external repositories.
Display Linked Items
[x] Each document displays related documents or references as linked items.
[x] Linked items are displayed as clickable links.
[x] Clicking a linked item shows its detailed metadata and a content preview.
Display Duplicates
[ ] The system automatically identifies and flags duplicate documents in the search results.
[ ] Users are notified of duplicate documents and can choose to view or ignore duplicates.
[ ] The system shall provide an option to merge duplicate entries and keep the most relevant information.
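
One possible way to approach the duplicate criteria above is sketched below. It rests on assumptions that the issue does not state: records are plain dictionaries, a shared DOI or an identical normalised title marks a duplicate, and "most relevant information" is read as "the first non-empty value, taken from the most complete record". All function and field names are hypothetical.

```python
import re
from collections import defaultdict


def duplicate_key(record: dict) -> str:
    """Prefer a DOI when present; otherwise fall back to a normalised title."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return "doi:" + doi
    title = re.sub(r"[^a-z0-9]+", " ", (record.get("title") or "").lower()).strip()
    return "title:" + title


def flag_duplicates(records: list) -> list:
    """Group records by key and return only the groups with more than one member."""
    groups = defaultdict(list)
    for record in records:
        groups[duplicate_key(record)].append(record)
    return [group for group in groups.values() if len(group) > 1]


def merge(group: list) -> dict:
    """Merge a duplicate group, keeping the first non-empty value per field."""
    # Sort the most complete record first so that its values win on conflict.
    ordered = sorted(group, key=lambda r: sum(bool(v) for v in r.values()), reverse=True)
    merged = {}
    for record in ordered:
        for field, value in record.items():
            if value and not merged.get(field):
                merged[field] = value
    return merged
```

Exact-key matching like this misses near-duplicates (translated titles, differing versions), so it is best seen as a first pass that a user can confirm or ignore, in line with the second criterion.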
Display Broken Links
[x] The system automatically checks for broken links during data ingestion and periodically thereafter.
[ ] Users are notified of broken links with a clear message and the option to report or ignore the issue.
[ ] Broken links are highlighted in search results, and users can view a list of all broken links.
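
As a rough illustration of the broken-link check described above, the sketch below probes each URL with an HTTP `HEAD` request using only the Python standard library. The function names are hypothetical, and treating any status of 400 or above (or no response at all) as "broken" is an assumption, not a rule taken from the issue.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, Optional


def probe(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status for url, or None if no response was received."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404 or 500.
        return err.code
    except (OSError, ValueError):
        # DNS failure, refused connection, timeout, or a malformed URL.
        return None


def find_broken_links(
    urls: Iterable[str],
    probe_fn: Callable[[str], Optional[int]] = probe,
) -> list:
    """Return (url, status) pairs for links that look broken."""
    broken = []
    for url in urls:
        status = probe_fn(url)
        if status is None or status >= 400:
            broken.append((url, status))
    return broken
```

Passing the probe in as a parameter lets the same routine run at ingestion time and in a periodic job, and makes it testable without network access. Some servers reject `HEAD` requests (for example with 405), so a real checker would probably retry with `GET` before flagging a link.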
Data Quality Assurance
[ ] The system performs automatic checks to ensure data integrity (e.g., correct formats, valid values).
[ ] Each document displays a quality score or indicator based on predefined quality metrics (e.g., completeness, accuracy).
Requirement: Data Completeness (only internal use, no reporting!!)
[ ] The system shall ensure that all mandatory metadata fields (e.g., title, author, publication date) are present for each dataset and knowledge source.
[ ] Internal administrators/actors shall be able to view a completeness score or indicator for each search result.
[x] The system shall allow users to submit feedback or report incomplete data for improvement.
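
A completeness indicator of the kind mentioned above could be as simple as the share of mandatory fields that are filled in. The sketch below assumes records are dictionaries and uses the three example fields from the requirement; the key names and the `completeness` function are hypothetical.

```python
# Assumption: key names for the example mandatory fields named in the requirement.
MANDATORY_FIELDS = ("title", "author", "publication_date")


def completeness(record: dict, mandatory=MANDATORY_FIELDS):
    """Return (score between 0 and 1, list of missing mandatory fields)."""
    # A field counts as missing when it is absent, None, or only whitespace.
    missing = [f for f in mandatory if not str(record.get(f) or "").strip()]
    score = 1 - len(missing) / len(mandatory)
    return score, missing
```

Returning the missing field names alongside the score gives internal administrators something actionable, rather than a bare number.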
Requirement: Automatic Persistence Checks of Available Data/Knowledge
[ ] The system shall periodically verify the availability of datasets and knowledge sources in their originating repositories.
[ ] The user shall be notified if a dataset or knowledge source is no longer available, with the option to remove it from the search results.
[ ] The system shall log and report the persistence status of each dataset and knowledge source, providing a history of availability checks.
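
The logging and history criteria above might be modelled along the following lines. This is a minimal in-memory sketch under stated assumptions: the class and method names are hypothetical, and the rule that a resource counts as unavailable only after three consecutive failed checks is an illustrative choice to avoid notifying users about a single transient outage.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class AvailabilityCheck:
    checked_at: datetime
    available: bool


@dataclass
class PersistenceLog:
    # Maps a record identifier to its chronological list of checks.
    history: dict = field(default_factory=dict)

    def record(self, record_id: str, available: bool,
               when: Optional[datetime] = None) -> None:
        """Append the outcome of one availability check."""
        when = when or datetime.now(timezone.utc)
        self.history.setdefault(record_id, []).append(AvailabilityCheck(when, available))

    def is_unavailable(self, record_id: str, threshold: int = 3) -> bool:
        """True once the last `threshold` checks have all failed (assumed rule)."""
        recent = self.history.get(record_id, [])[-threshold:]
        return len(recent) == threshold and not any(c.available for c in recent)
```

The periodic job would call `record` after each check and use `is_unavailable` to decide when to notify the user; a real deployment would persist the history in a database rather than in memory.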
Origin: D1.3 Repository architecture