Closed kanasznagyzoltan closed 1 year ago
@kanasznagyzoltan : I'm not sure I understand the use case for this.
The `solr-export-statistics` command is a complete dump of all statistical information in your Solr `statistics` index, and based on the number of records in that index, it may create several dump files (looking at `SolrImportExport`, it appears it creates a new file for every 10,000 rows).
So, I'm worried moving this to the UI would be harder than it seems. When the process starts, you don't know how many files will be included in the export, nor do you know the size of the export (it will be large, as you are basically creating a full copy of all the data in your `statistics` index). In my opinion, that makes this command risky to make available in the UI as it currently works. For instance, it'd be bad if an Admin triggered this command from the UI, only to find it exports so much content that the server runs out of storage space. Whereas, if you are on the command line, you can determine how much storage space you currently have available (and how much the `statistics` core seems to be using) before proceeding.
All in all, I'm worried this command won't be simple to move to the UI. It'd need reworking so that there are protections in place to ensure you aren't using all your disk space, and to make sure the resulting export is downloadable via the web (as the export may consist of a number of large files). Those are my initial thoughts here, but I'd welcome other developers to add their opinions (especially if I'm overlooking something).
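To make the "protections" idea concrete, here is a minimal sketch of a pre-flight check that a UI-triggered export could run before writing anything. It is not how `SolrImportExport` works today. The 10,000 rows-per-file figure comes from the reading of the code mentioned above, while `avg_bytes_per_record`, `safety_factor`, and both function names are assumptions made up for illustration.

```python
import math
import shutil
from typing import Tuple

# Rows per dump file, as it appears from SolrImportExport (assumption).
ROWS_PER_FILE = 10_000


def estimate_export(num_records: int, avg_bytes_per_record: int) -> Tuple[int, int]:
    """Return (number of dump files, estimated total size in bytes)."""
    files = math.ceil(num_records / ROWS_PER_FILE)
    return files, num_records * avg_bytes_per_record


def has_room(export_dir: str, needed_bytes: int, safety_factor: float = 1.5) -> bool:
    """True if export_dir's filesystem has needed_bytes free, plus a margin."""
    free = shutil.disk_usage(export_dir).free
    return free >= needed_bytes * safety_factor
```

The record count would have to come from Solr itself (for example, the `numFound` value of a `rows=0` query), and the per-record size could only be a rough guess, so a check like this reduces the risk rather than removing it.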
@dsipos-dev Could you describe what we have done and how we did it? Thank you!
The `statistics` core contains cases, not statistics. `bin/dspace whatever` there. Working locally to the server, one can do things like mount the ultimate destination of the data as a network share and send the report directly to where it is wanted, without consuming server storage. Hiding this behind the GUI is very limiting. What is the use case here?
Flagging as `needs discussion` and moving into our unscheduled backlog, as the use cases are still unclear here.
Closing, outdated and no use case provided
Some more thoughts:
You can get the usage records directly from Solr. You can select just the records that you want using Solr's query language.
Not everyone should have access to these data. You can secure access to the Solr port independently of access to DSpace, and you can create accounts in Solr independently of DSpace accounts. You can (and arguably should) run Solr on a different host altogether, behind a corporate firewall that makes it invisible to The World.
Solr already provides a way ("cursorMark") to deal efficiently with huge quantities of records, which is exactly what usage recording produces. No temporary files are involved, so there is no reason for concern about server storage.
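As a rough illustration of the points above, the sketch below streams matching records straight from Solr using cursorMark paging, with no temporary files on the DSpace server. The URL, the `uid` unique key, and the helper names are assumptions for illustration. Adjust them to your own Solr host and schema. The cursorMark rules themselves (start at `*`, sort on the unique key, stop when `nextCursorMark` repeats) are standard Solr behaviour.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Dict, Iterator

# Assumed location of the statistics core; change for your installation.
SOLR_SELECT_URL = "http://localhost:8983/solr/statistics/select"


def http_fetch(params: Dict[str, object]) -> dict:
    """Send one select request to Solr and return the decoded JSON response."""
    url = SOLR_SELECT_URL + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def iter_records(
    query: str,
    fetch: Callable[[Dict[str, object]], dict] = http_fetch,
    rows: int = 1000,
    unique_key: str = "uid",  # assumed uniqueKey field of the core
) -> Iterator[dict]:
    """Yield every document matching query, one page at a time."""
    cursor = "*"  # Solr's starting cursor
    while True:
        page = fetch({
            "q": query,
            "rows": rows,
            "sort": f"{unique_key} asc",  # cursorMark requires a sort on the unique key
            "cursorMark": cursor,
            "wt": "json",
        })
        yield from page["response"]["docs"]
        next_cursor = page["nextCursorMark"]
        if next_cursor == cursor:  # an unchanged cursor means no more results
            break
        cursor = next_cursor
```

A caller would loop over something like `iter_records("time:[2023-01-01T00:00:00Z TO *]")` (a hypothetical query; the field names depend on your schema) and write each record wherever it is wanted, such as a network share.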
At the moment you can only export the usage statistics using the DSpace command line tool.
This issue is purely a feature request to let an admin user export and download the usage statistics from the Angular frontend. The Angular frontend already has a submenu which can be extended for this purpose. I attached a screenshot showing how the feature could be added:
My colleague @dsipos-dev could do the implementation. We have tried to estimate the work required. As far as we know, it will mostly require backend implementation, since the frontend UI is rendered from the backend configuration.