clarin-eric / resource-families-issues


HindEnCorp 0.5 #47

Closed jakoble closed 4 years ago

jakoble commented 6 years ago

http://hdl.handle.net/11858/00-097C-0000-0023-625F-0

stranak commented 5 years ago

File descriptions (metadata) clearly say this.

If the only problem is that you need to go to the original repository (i.e. you don't see everything in VLO), I am hesitant to consider this a real error.

twagoo commented 5 years ago

@stranak issues reported in this context are not necessarily indicative of errors in the original metadata, although admittedly the label "metadata issue" could be taken to imply that :)

In this case however I think the solution lies in resolving VLO issue #25. Taking that into account, closing this issue is probably justified.

twagoo commented 5 years ago

Ah, looking at the case again, I believe that resolving that VLO issue would not improve the situation for this corpus, as the metadata presented on the landing page is not present in the metadata as processed by the VLO. My suggestion would be to export the resource descriptions to CMDI.

stranak commented 5 years ago

I am not sure what the best solution is here. For well-known reasons we don’t provide bitstreams to VLO. We could of course provide bitstream descriptions nevertheless, but that might be more confusing than helpful.

I am biased toward thinking the current situation is OK and the fact that you should go to the homepage of the resource to find all the information is not an issue, but a feature.

twagoo commented 5 years ago

Arguably... but in that case, from a pragmatic point of view, it would be helpful if the most important information from the resource descriptions were also included in the top-level description. In the current situation, this record cannot be retrieved by searching for, e.g., sentence-parallel.

stranak commented 5 years ago

I am sorry, but I disagree. I understand it is the one piece needed for this particular use case. But it is just one random use case. Somebody else will think some other piece of metadata is the crucial one. Take your pick, which one: https://lindat.mff.cuni.cz/repository/xmlui/bitstream/handle/11858/00-097C-0000-0023-625F-0/README.txt?sequence=1&isAllowed=y

Of course we can invent complex structural metadata to capture all of this, or we can make the Description field this size. I still prefer treating VLO as a rough first source of information and going to the home page for the rest. Cf. also the discussion at #53.

twagoo commented 5 years ago

I can't assess for each bit of information how crucial it is - in principle that is your call. In this context we are looking at resources with the improvement of resource retrieval in mind, though. That is fine as long as you understand that with every bit of (potentially) relevant information omitted from the CMDI, discoverability of the record in the VLO goes down.

stranak commented 5 years ago

OK, I added sentence-parallel into description and into keywords (dc.subject).

Discoverability is a topic for a bigger discussion. Can you find out, for instance, how many visits VLO directed to us (the landing page) for this resource?

twagoo commented 5 years ago

Indeed, probably should be discussed in a context with a less specific scope.

But to answer the question: yes, although no exact numbers as we need the user's consent in order to record that information. Normally the referrer should also be distinguishable at the targeted location although there are some technical constraints related to data privacy (that have recently been discussed on Slack).