textbrowser / biblioteq

Archive and catalog the world for today's and tomorrow's generations! Awesome and everyware.
https://textbrowser.github.io/biblioteq/

Z3950 Unimarc Analyser: Fields Publisher / Place of Publication and Date (210 and 214) #343

Closed meteos77 closed 9 months ago

meteos77 commented 9 months ago

Status of the Z3950 Unimarc analyzer for records from before 2019
Test: Z3950 BnF (France), isbn13: 978-2355-0442-98

The Publisher and Place of publication fields are filled in correctly:
210$a -> Place of publication
210$c -> Publisher

The date does not work because of the "DL" in front of it:
210$d -> Publication date

Status of the Z3950 Unimarc analyzer for records from after 2019
Test: Z3950 BnF (France), isbn13: 978-2749-9349-83; isbn13: 979-1036-6069-77

The Publisher, Place of publication, and Date fields are not filled in:
214$a -> Place of publication
214$c -> Publisher
214$d -> Publication date

The standard has therefore evolved. Could you please add fields 214 to the analysis made by BiblioteQ for the addition of books by Z3950? Thank you in advance.
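One way to read this request, as a minimal sketch and not BiblioteQ's actual code: look in field 214 first and fall back to 210 when 214 is absent or lacks the subfield. The sketch assumes the record is already available as a map from tag to field text, with subfields marked as `$a`, `$c`, `$d`. The names `subfield` and `publication_subfield` are hypothetical.

```cpp
#include <initializer_list>
#include <map>
#include <string>

// Hypothetical helper: returns the trimmed text of one subfield (for example
// "$c") from a field line, reading up to the next '$' or the end of the line.
std::string subfield(const std::string &field, const std::string &code)
{
  const auto begin = field.find(code);

  if(begin == std::string::npos)
    return "";

  const auto start = begin + code.size();
  const auto end = field.find('$', start);
  const std::string value = field.substr
    (start, end == std::string::npos ? std::string::npos : end - start);
  const char *ws = " \t";
  const auto first = value.find_first_not_of(ws);

  if(first == std::string::npos)
    return "";

  return value.substr(first, value.find_last_not_of(ws) - first + 1);
}

// Hypothetical helper: prefers the newer 214 field and falls back to 210.
std::string publication_subfield
(const std::map<std::string, std::string> &record, const std::string &code)
{
  for(const char *tag : {"214", "210"})
    {
      const auto it = record.find(tag);

      if(it != record.end())
        {
          const std::string value = subfield(it->second, code);

          if(!value.empty())
            return value;
        }
    }

  return "";
}
```

With this shape, `publication_subfield(record, "$c")` would give the publisher for both older (210) and newer (214) records. The `$d` value is returned as-is, so a prefix such as "DL" still has to be handled separately.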

meteos77 commented 9 months ago

u_b_214_update2019_online_final.pdf

meteos77 commented 9 months ago

u_b_210_update2019_online_final.pdf

textbrowser commented 9 months ago

BQ downloads the data and you can copy and paste it into the fields. BQ can't adapt to infinite text possibilities as it doesn't have a programmable parser. One could be added and then the expectation becomes that BQ provides the parser too. The standard will always evolve and BQ cannot continue adapting to it. So, copy and paste is your best solution. I'll add a simple tool to other options. Simple and that's that. Beyond that, I won't be forever tweaking parsers. :)

textbrowser commented 9 months ago

I see the DL. Redesign! The $d field may include a c and that's proper. What's DL? Nonsense.

textbrowser commented 9 months ago

The issue with standards is that they are not standard. The date may be "2 0 0 2" and BQ will not comprehend it. See? It appears like it's valid. All of these variations require a strong regular expression.

meteos77 commented 9 months ago

I understand that you can't foresee every crazy case.

What's surprising to me is that the Z3950 server specified is the service of the "French National Library", so it should follow the "standard norm" on its server.

meteos77 commented 9 months ago

Notes on $d Date of publication, production, distribution/dissemination, manufacture, copyright or protection Registration particulars :

textbrowser commented 9 months ago

Dates are now extracted super smartly without a regular expression. Discover the first 4 numbers after $d. Those must compose the date. Now, as BQ is super smart anyway and flexible (and I know most people do not know of this feature), you can modify the downloaded data and command BQ to parse it again. In a way, teach it to work with correct data instead of dictating the software to comprehend endless possibilities. So that's completed; the date.
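The "first 4 numbers after $d" idea can be sketched as below. This is an illustration under assumptions, not the code that was committed: the function name `year_after_d` is hypothetical, the input is assumed to be a single 210 or 214 field line, and stopping at the next `$` is an extra assumption so that digits from a later subfield are not picked up.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: returns the first four digits found after "$d", or an
// empty string if "$d" is missing or fewer than four digits follow it.
std::string year_after_d(const std::string &field)
{
  const std::string::size_type start = field.find("$d");

  if(start == std::string::npos)
    return "";

  std::string year;

  for(std::string::size_type i = start + 2; i < field.size(); i++)
    {
      const unsigned char c = static_cast<unsigned char> (field[i]);

      if(c == '$')
        break; // Assumption: the next subfield begins here, so stop.

      if(std::isdigit(c))
        {
          // Non-digits such as "DL", "c", or spaces are simply skipped.
          year += static_cast<char> (c);

          if(year.size() == 4)
            return year;
        }
    }

  return "";
}
```

Because non-digits are skipped rather than matched, the variants mentioned in this thread ("DL 2019", "c2002", "2 0 0 2") would all reduce to a four-digit year without a regular expression.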

textbrowser commented 9 months ago

parse-me

textbrowser commented 9 months ago

That's futuristic vision. :P Like, BQ knew that the standards were broken [malleable] and it allows you to adjust data.

textbrowser commented 9 months ago

214 is totally new to BQ and requires new logic.

meteos77 commented 9 months ago

I've just learned how to modify them directly in the Marc tag field :-) it may indeed help, but why hide features (I didn't see anything in the doc)?

textbrowser commented 9 months ago

It's not hidden if you scroll. And I don't document everything because there are too many things.

meteos77 commented 9 months ago

For 214, no modification is needed: with the "new functionality" (writing in the MARC tag area), I changed 214 to 210, and the Analysis button then found the Publisher, Place, and Date.

textbrowser commented 9 months ago

Almost done with 214.

textbrowser commented 9 months ago

Committed.

textbrowser commented 9 months ago

214 works for books.

meteos77 commented 9 months ago

Super, thanks for the modification : 210 and 214 work fine.

I know you don't like to touch the parser because it's repetitive programming. But as an end user, a program that pre-fills all the information with just the ISBN and a click is really GREAT! My favorite feature is the enumerations for language and currency unit: those 2 fields are filled in automatically.

textbrowser commented 9 months ago

214;$a;$b;id.publisher # Set id.publisher to $a@214. Retrieve from $a to $b.
214;$a;id.publisher # Set id.publisher to $a@214. Retrieve from $a to the end of 214.

A descriptive language like this, or better, would allow BQ to remain future-proof. That is, zero source modifications. Another idea would be allowing for JavaScript to parse the query data. This requires knowledge from the people so it's a lot more difficult.
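To make the idea concrete, here is a minimal sketch of how rules of this shape might be interpreted. Nothing here exists in BiblioteQ: `Rule`, `parse_rule`, `apply_rule`, and `trim` are hypothetical names, the record is assumed to be a map from tag to field text, and everything after `#` is treated as a comment.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical rule: "tag;from;to;target" or "tag;from;target".
struct Rule
{
  std::string tag;    // For example, "214".
  std::string from;   // For example, "$a".
  std::string to;     // For example, "$b". Empty means the end of the field.
  std::string target; // For example, "id.publisher".
};

// Removes leading and trailing whitespace.
std::string trim(const std::string &s)
{
  const char *ws = " \t\r\n";
  const auto first = s.find_first_not_of(ws);

  if(first == std::string::npos)
    return "";

  return s.substr(first, s.find_last_not_of(ws) - first + 1);
}

// Drops a trailing "# comment" and splits the rest on ';'.
// Returns false unless the rule has three or four parts.
bool parse_rule(const std::string &text, Rule &rule)
{
  std::istringstream stream(text.substr(0, text.find('#')));
  std::string part;
  std::vector<std::string> parts;

  while(std::getline(stream, part, ';'))
    parts.push_back(trim(part));

  if(parts.size() == 3)
    rule = Rule{parts[0], parts[1], "", parts[2]};
  else if(parts.size() == 4)
    rule = Rule{parts[0], parts[1], parts[2], parts[3]};
  else
    return false;

  return true;
}

// Stores the trimmed text between rule.from and rule.to in
// values[rule.target]. If rule.to is empty or not found, the text runs to
// the end of the field. Returns false if the tag or rule.from is missing.
bool apply_rule(const Rule &rule,
                const std::map<std::string, std::string> &record,
                std::map<std::string, std::string> &values)
{
  const auto it = record.find(rule.tag);

  if(it == record.end())
    return false;

  const std::string &field = it->second;
  const auto begin = field.find(rule.from);

  if(begin == std::string::npos)
    return false;

  const auto start = begin + rule.from.size();
  auto end = std::string::npos;

  if(!rule.to.empty())
    end = field.find(rule.to, start);

  values[rule.target] = trim
    (field.substr(start,
                  end == std::string::npos ? std::string::npos : end - start));
  return true;
}
```

With rules kept in a text file, a change such as 210 -> 214 would mean editing one rule line instead of the source.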

meteos77 commented 9 months ago

I hope they don't change the field numbers constantly, and that 210 -> 214 is an exception.

Your example is certainly usable for a simple user, as opposed to modifying the .cpp file.

The work involved in creating your language description seems enormous compared with the modifications you make to the .cpp file?

textbrowser commented 9 months ago

Yes, it will be more work now. But we must live in the present and in the future. Plan for the unknown.

textbrowser commented 9 months ago

Configurable software is a form of programmable software. Software which allows you to execute another language so that it can be configured is initially expensive but allows you to customize it according to the richness of the language.