gbv / coli-ana

API to analyze DDC numbers
https://coli-conc.gbv.de/coli-ana/app/
MIT License

Check Pica3 representation of table numbers #79

Open stefandesu opened 5 months ago

stefandesu commented 5 months ago

Fixing the DDC notation pattern caused the Pica3 representation to change. For our monitoring example, it changed from

5420 [23]700.90440747471-G--7-T1--09044-T1--074-T2--7471$Acoli-ana

to

5420 [23]700.90440747471-G--7-T1--09044-T1--0901-0905%3A074-T2--7471$Acoli-ana

The included numbers changed slightly: the table number T1--074 became T1--0901-0905%3A074 (even though the actual result in the web UI did not change). The new table number contains a colon (:), which is percent-encoded here as %3A.
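To illustrate the encoding mentioned above, here is a minimal Python sketch using only the standard library. It is not how coli-ana or jskos-tools builds the string; it only shows that percent-encoding a notation turns the colon into %3A and that decoding restores it. The variable names are made up for the example.

```python
from urllib.parse import quote, unquote

# Notation with a table range, as it should appear in the output.
notation = "T1--0901-0905:074"

# With safe="", quote() percent-encodes every reserved character,
# including ":"; digits, letters and "-" are left untouched.
encoded = quote(notation, safe="")
print(encoded)           # T1--0901-0905%3A074

# unquote() reverses the encoding and recovers the original notation.
print(unquote(encoded))  # T1--0901-0905:074
```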

Note that I will change the monitoring for now so that it won't bug us about this. I might change it back as soon as this is fixed.

stefandesu commented 5 months ago

I think it's an encoding issue with the : in the notation. Comparing the JSON output of the two versions shows it:

❯ diff 0.4.1.json 0.5.1.json
183c183
<         "uri": "http://dewey.info/class/1--0901-0905:07/e23/",
---
>         "uri": "http://dewey.info/class/1--0901-0905%3A07/e23/",
185c185
<           "T1--0901-0905:07",
---
>           "T1--0901-0905%3A07",
193c193
<         "uri": "http://dewey.info/class/1--0901-0905:074/e23/",
---
>         "uri": "http://dewey.info/class/1--0901-0905%3A074/e23/",
195c195
<           "T1--0901-0905:074",
---
>           "T1--0901-0905%3A074",
204c204
<             "uri": "http://dewey.info/class/1--0901-0905:07/e23/"
---
>             "uri": "http://dewey.info/class/1--0901-0905%3A07/e23/"

I will look into it.
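For reference, the diff suggests the notation ends up percent-encoded whenever the URI is. A rough Python sketch of deriving the notation from such a URI while decoding it could look like the following. The function name `notation_from_uri`, the URI pattern, and the rule that table numbers 1 to 6 get a "T" prefix are assumptions based only on the examples in the diff, not the actual jskos-tools code.

```python
import re
from typing import Optional
from urllib.parse import unquote

# Assumed URI shape, taken from the diff: http://dewey.info/class/<notation>/e<edition>/
DDC_URI = re.compile(r"^http://dewey\.info/class/(?P<notation>[^/]+)/e(?P<edition>\d+)/$")

def notation_from_uri(uri: str) -> Optional[str]:
    """Return the decoded DDC notation for a dewey.info class URI, or None."""
    match = DDC_URI.match(uri)
    if match is None:
        return None
    # Decode percent-escapes so "%3A" becomes ":" again.
    notation = unquote(match.group("notation"))
    # Assumption: table entries look like "1--..." in the URI
    # but "T1--..." in the notation (as in the diff).
    if re.match(r"^[1-6]--", notation):
        notation = "T" + notation
    return notation

print(notation_from_uri("http://dewey.info/class/1--0901-0905%3A074/e23/"))
# T1--0901-0905:074
```

With this approach both the encoded and the unencoded form of the URI yield the same notation, which is what the monitoring comparison would need.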

stefandesu commented 5 months ago

This is a bug in jskos-tools: https://github.com/gbv/jskos-tools/issues/41

However, it opened up a discussion on how to handle these kinds of table range notations in general: https://github.com/gbv/jskos-data/issues/47