atmire / IRUS

Other
2 stars 1 forks source link

IRUS patch does not apply cleanly on DSpace 6.3 #3

Open bbranan opened 4 years ago

bbranan commented 4 years ago

Attempting to apply the patch (https://github.com/atmire/IRUS/compare/dspace_6x%E2%80%A6stable_6x.diff) as described in the documentation: https://atmire.github.io/IRUS/#/prerequisites fails.

Attached is a screenshot showing my attempt at applying the patch to the DSpace 6.3 tag. Looking at the patch itself, it appears to include changes that were made as part of the DSpace 6.3 release. This is probably because changes for 6.3 were applied to the stable_6x branch but not the dspace_6x branch, which hasn't been updated since 2017.

An easier way to see the changes included in the patch: https://github.com/atmire/IRUS/compare/dspace_6x...stable_6x. It's pretty clear that this change set is far larger than it needs to be to add the use of IRUS.

irus-patch-apply-failure

davidatmire commented 4 years ago

Dear Bill,

After a first investigation, it indeed seems that the dspace_6x branch (https://github.com/atmire/IRUS/tree/dspace_6x) refers to the DSpace 6.0 codebase, which is probably why the patch/diff file contains a too large change set.

We could look into updating the dspace_6x branch to refer to the DSpace 6.3 codebase in order for the patch/diff file to apply cleanly on DSpace 6.3. This being said, in order to also cover DSpace 6.0., 6.1., 6.2. (+ 6.4. which should be released in a near future) and eventually customised DSpace 6.x repositories as well, I actually wonder if it would not be more appropriate to update the documentation (https://atmire.github.io/IRUS/#/) with a more detailed procedure enabling users to generate the patch/diff file for all these use cases by themselves. (I still have to confirm that this approach is possible from a technical point of view but, in theory, using a git diff command comparing a given branch (DSpace 6.0, 6.1, 6.2, 6.3, 6.4 or customised DSpace 6.x) with the stable_6x branch should be possible).

I'm of course always happy to read your opinion on this.

With thanks,

David

bbranan commented 4 years ago

Hi David,

Thanks for taking a look at this. I would suggest taking two steps here:

Step 1: Update the documentation to describe how to create the patches for any 6.x version, as you suggested.

Step 2: Follow the patch creation procedure outlined in Step 1 to create and capture a patch for each 6.x version that you want to support then test those patches against the applicable DSpace version to ensure that they apply properly and that the code operates as expected. When you have those patch files created and tested, then post them for download and update the documentation to suggest using the pre-generated patches whenever possible.

This will make it much easier for you to be confident that the patches actually work against the DSpace versions they are meant to support, and that individuals attempting to use the patches don't run into problems in the patch generation process.

If time is limited, I'd suggest focusing first on generating and testing the patch for the latest production release version of DSpace, which means 6.3 now and 6.4 when it comes out.

jonas-atmire commented 4 years ago

@bbranan A bit of an update/background information on the patch creations: The reason why we do not create a per version patch file for this is because the only place where the 'default' code is touched upon is a single pom file that loads in the separated IRUS code.

As mentioned by David, we noticed that the dspace_6x branch referred to 6.0, which is incorrect, it should be the same dspace version as the stable_6x branch. If both branches were 6.0/6.1/6.2, the resulting patch would be the exact same.

That being said: It turns out that we DID have the updated dspace_6x branch with DSpace 6.3 installed on our internal git repository (which we use for development and testing), but this was not pushed at the time of the last stable_6x push. Locally, we created a patch based on this version of the dspace_6x branch, so the minor versions match, and applied it to a few DSpace instances for testing purposes. An example of the 'apply' command on a 6.0 and 6.3 can be found here: 6.0 image 6.3 image

When comparing the newly pushed dspace_6x version, the number of differences is a lot snmaller: https://github.com/atmire/IRUS/compare/dspace_6x...stable_6x

The current patch will work for 6.0 to 6.3. When 6.4 is released, the patch can be applied as is to 6.4 without any changes required.

To avoid confusion, I would advise against having people creating the patch manually, this is not required for this kind of github diff (It would be something entirely different if binaries were involved, which the git diff can't handle as is)

There was a missing step in the documentation that we have added in order to make sure that the flyway migration has run correctly (which handles the addition of a db table. Flyway has an intrinsic check on the 'version' and does not try and run 'older' versions automatically.) :

./dspace database migrate ignored

davidatmire commented 4 years ago

Hi @bbranan,

Have you already had the time to check my colleague's feedback ? In theory, all current DSpace 6 versions (DSpace 6.0 to 6.3) are covered and we will have a look at 6.4 when it is released.

It would be great if we could have your confirmation that the new patch applies cleanly on 6.3.

With thanks,

David

bbranan commented 4 years ago

Hi @davidatmire and @jonas-atmire, thanks for the updates. I'm now able to apply the patch cleanly against DSpace 6.3! I'll be testing the deployed integration soon.

Is there a way to verify that content is being transferred properly when using stats.tracker.environment = test or is this only possible after this has been swapped to production?

jonas-atmire commented 4 years ago

@bbranan It is easiest to check for the logs being sent on page views/bitstream downloads. The configuration of the debug logging of the relevant class "ExportUsageEventListener" is set here:

This will log in 2 parts:

  1. Something similar to the following, noting that the url will be sent to the relevant tracker url
  2. Afterwards, it logs the result (For example if it succeeded)

Another thing that can be done is check with the IRUS helpdesk themselves if they saw the posts coming through, mentioning:

A thing to note is that the sending of the tracker urls is NOT done for admin accounts, so make sure to test this as an anonymous user for good measure.

davidatmire commented 4 years ago

Hi @bbranan,

Just a few more information on top of @jonas-atmire last comment. When stats.tracker.environment = test, stats events are sent to the IRUS test tracker: http://irus.jisc.ac.uk/counter/test

The first part of my colleague's answer is about how you can check in the DSpace logs (on the server where DSpace is installed - [dspace]/dspace/log/dspace.log.YYYY-MM-DD). The best verification that you can make on the end-to-end process it to check with JISC/IRUS directly that the events are correctly received by providing them with the information mentioned by Jonas. Paul is in theory the appropriate contact person for this verification (at least for the UK but he will be able to redirect to the right person if there is another representative for the US).

Speaking about the stats.tracker.xxx configuration, I have just realised that the tracker production URL is configured with https://irus.jisc.ac.uk/counter/ (see https://github.com/atmire/IRUS/blob/stable_6x/dspace/config/modules/stats.cfg) which is originally the correct URL but new country-specific IRUS trackers have been created for the production environment, e.g.: https://irus.jisc.ac.uk/counter/us/ (I assume you will need to use this one) https://irus.jisc.ac.uk/counter/anz/

Should we update https://atmire.github.io/IRUS/#/installation in order to give a procedure to update the production URL when appropriate ?

bbranan commented 4 years ago

Thanks @jonas-atmire and @davidatmire! After applying the patch and deploying I was able to verify in the logs that the expected "Successfully posted" messages were being generated. I've also reached out to have this checked from the IRUS end. Also, thanks for the note about a different URL being used for IRUS US.

I definitely think it would be useful to update the documentation to outline the procedure needed to transition from test to production, including a note about there being multiple URLs and how to determine the right one.

One more question: With the patch in place, what is the recommended method to disable the IRUS functionality (if I need to turn it off for some period of time)? I'm assuming that I'll need to comment out the reference to stats.cfg in dspace.cfg, but it's not clear if there are other steps needed. It wouldn't hurt to include this in the documentation as well.

davidatmire commented 4 years ago

Thanks for your feedback @bbranan !

I have updated the following section of the documentation to cover country/area specific IRUS tracker URLs : https://atmire.github.io/IRUS/#/installation?id=_4-tracker-configuration (I will confirm with JISC that I am not missing any area/country).

Regarding your other question, we have never been confronted before with the use case of an institution having the patch installed and needing to disable it later on. There is currently no clean way to disable the IRUS functionality once the patch is installed. Changing the tracker URL (tracker.produrl and/or tracker.testurl) with a fake URL does work but it is kind of a hack (as it also generates "errors - failed to send URL" in the dspace logs). We could look into adding a tracker.enabled property and updating the code to skip the IRUS stats logging if this property is set to false for example but I would need approval for the extra work from JISC to proceed with this. Do you think I should discuss this with JISC ?

bbranan commented 4 years ago

Hi @davidatmire, the use case I have in mind is to deploy DSpace for everyone that we support with the same code base, meaning that the IRUS tracker patch would be installed, but the capability would be disabled by default. This would make it much easier for us to then enable IRUS tracking for those that choose to use it (and are willing to pay for it) and leave it disabled for everyone else. This allows us to avoid having two independent builds of the DSpace software and having to redeploy a different version if we need to turn this feature on or off. Please do reach out to JISC about this need. I would be happy to contribute to that conversation in any way that is helpful.

davidatmire commented 4 years ago

Hi @bbranan, thanks for the quick reaction. I have reached out to JISC about the ability to enable/disable IRUS tracking at configuration level. I will let you know as soon as I hear from them.

davidatmire commented 4 years ago

Hi @bbranan, This is just to let you know that JISC approved the extra tracker.enabled configuration. It has been implemented and tested on both DSpace 5 and DSpace 6 (local instances) and the changes have been pushed to this GIT repository. The field tracker.enabled has also been added to the documentation: https://atmire.github.io/IRUS/#/configuration

If you have some time to install the new version of the patch and also test the new tracker.enabled property on your end, that would be very nice.

bbranan commented 4 years ago

@davidatmire that's great news, thanks for getting that feature added so quickly! I've put it on my list to test out the updated patch.

alawvt commented 2 years ago

We are trying to set up IRUS tracking and have installed this patch. I suggest adding this troubleshooting information (and any other testing information) to the documentation, https://atmire.github.io/IRUS/#/installation, even if it is just to point to this issue. It is very useful to know where to look in the logs, etc. to see if it is working. Thanks.

davidatmire commented 2 years ago

Thx for the suggestion @alawvt, I have updated the documentation on https://atmire.github.io/IRUS/#/installation with the testing/troubleshooting information from this git issue.

alawvt commented 2 years ago

@davidatmire, thank you very much for updating the documentation with this testing information.

alawvt commented 2 years ago

Turning on Debugging (optional)

If you'd like to try and do some debugging yourself, you can change the DSpace logger settings to DEBUG which will sometimes provide you with more information about the error. To turn on debugging, visit the [dspace]/config/log4j.properties file and do the following:

To enable DEBUG logging in the dspace.log file, change the log4j.rootCategory and log4j.logger.org.dspace settings to DEBUG rather than INFO.

(XMLUI Only) To enable DEBUG logging in the cocoon.log file, change the log4j.logger.org.apache.cocoon setting to DEBUG rather than INFO.

NOTE: You'll need to restart Tomcat after enabling DEBUG mode in the log4j.properties file. WARNING: Make sure to turn off debugging once you are finished. Leaving debugging turned on will cause the log files to grow very large very quickly!

davidatmire commented 2 years ago

@alawvt , thank you for the additional documentation proposal. Even if this is indeed really useful information for debugging purposes, this relates more to general Apache/DSpace debugging than to IRUS-specific debugging. Hence, I would suggest that we keep your comment in this git issue / thread, but do not port it to the IRUS-specific documentation.

alawvt commented 2 years ago

@davidatmire, sure. I added these instructions from Paul, since changing the log level is necessary to see the DEBUG messages in section 8, https://atmire.github.io/IRUS/#/installation.