EDIorg / ECC

ECC = EML Congruence Checker
5 stars 0 forks source link

raise level of entityNameLength to "error" #27

Open mobb opened 4 years ago

mobb commented 4 years ago

de facto, this has been true all along - that if the entityName is longer than a certain length, the package cannot be uploaded into pasta. But the check is classified as a warn, implying that the package can go in. This has happened a few times.

Our original premise was that the only ECC-violations that caused "error" (and denial of upload to pasta) were those that meant the data table itself was unusable. And technically, an entityName that is too long does not fit that. But the pattern we have seen in data packages is the same string in entityName and entityDescription -- which is bad practice.

Comment here on whether you think it's OK to elevate the status of this check to "error"

atn38 commented 4 years ago

I wasn't aware of this check. What is the limit?

On Fri, Nov 15, 2019 at 1:49 PM mobb notifications@github.com wrote:

de facto, this has been true all along - that if the entityName is longer than a certain length, the package cannot be uploaded into pasta. But the check is classified as a warn, implying that the package can go in. This has happened a few times.

Our original premise was that the only ECC-violations that caused "error" (and denial of upload to pasta) were those that meant the data table itself was unusable. And technically, an entityName that is do long does not fit that. But the pattern we have seen in data packages is the same string in entityName and entityDescription -- which is bad practice.

Comment here on whether you think it's OK to elevate the status of this check to "error"

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/EDIorg/ECC/issues/27?email_source=notifications&email_token=AKAZD5Q6O5JPYX7QSISJ5XLQT34NTA5CNFSM4JN7OX62YY3PNVWWK3TUL52HS4DFUVEXG43VMWVGG33NNVSW45C7NFSM4HZWLSIQ, or unsubscribe https://github.com/notifications/unsubscribe-auth/AKAZD5V7COSPV2E3AWEQDUDQT34NTANCNFSM4JN7OX6Q .

gastil commented 4 years ago

Good to catch that error with the ECC before it hits pasta and causes pasta grief. I approve. The limit of 256 characters is obviously way bigger than best practice. But our logic of being able to set this an error rather than a warn, at least for the present, allows us to do this with the 256 char limit. In future I suggest we tighten that to perhaps 100 or 150 characters. But that tighter limit would have to go thru the approval cycle and could only be argued to be a warn since it does not prevent data use.

Does the checker explanation to the user mention that entityName should not be identical to entityDescription? We discussed adding a category between info and warn, maybe alert. Would crimes against BP be worthy of alert? ;)

mobb commented 4 years ago

alert issue is #23

mobb commented 4 years ago

please confirm, @gastil, what you think we should do for entityName, now:
a) set to error and set limit to 256 b) set to error and leave limit at 100 (current limit for a warn) c) leave as is (a warn, which allows pasta to blow up)

If you want a new check, make a new issue

gastil commented 4 years ago

I vote (a) now, imediately and (b) in 6 months after the procedural notification of the IMC.

srearl commented 4 years ago

Elevating this to error seems like a prudent change. This will aid, as suggested, eliminating any confusion about including description details in entityName, and, unusual but possible, including data in that element. I include hashes in my file names so would be quite opposed to a length limit less than 100 characters. I can comment on the other issue but will add here that I very much do NOT like the idea of adding alert as a new level. info, warn, and error make perfect sense and I think adding alert is only going to add complexity and confusion.

mobb commented 4 years ago

For future reference, checks adhere to the BP here: https://ediorg.github.io/data-package-best-practices/EMLmetadata/entity-datatable-spatialraster-spatialvector-storedprocedure-view-otherentity.html

For how to use entityName, objectName and entityDescription, look about half way down the page. This link is a little clunky right now, it will be edited when I get the section headers sorted out (probably next week)

srearl commented 4 years ago

this is an aside but interesting would be to do (1) an analysis of how ent names currently residing in pasta are treated, and (2) whether data users find that they provide any value beyond the entity description.