textbrowser / biblioteq

Archive and catalog the world for today's and tomorrow's generations! Awesome and everyware.
https://textbrowser.github.io/biblioteq/

CLOSE : not urgent : multi volume import problem with identical isbn-13 #167

Closed meteos77 closed 2 years ago

meteos77 commented 2 years ago

Continuation of "2 books + 1 isbn + 2 access numbers = 1 problem csv import" (#153).

Please look at this small detail when you have time. It's not urgent; it can go on the to-do list :-)

Solve the problem of the failed import when 2 volumes of a book share the same ISBN-13:

1 - the accession number is different for each volume

2 - the UNIMARC field 461 $v indicates the position

20_Source_PB_Import_File-doublons-without-column-volume_number

I can create a "volume" field in the Catmandu -> CSV script to help the import:

20_Source_PB_Import_File-doublons-with-column-volume_number

On the other hand, I can't know whether there will be a conflict, because Catmandu processes records line by line and is not aware of duplicates.

So it is necessarily at the BiblioteQ level that a pseudo ISBN-13 should be created to allow the import.

UNIMARC: field 461, subfield $v indicates the part/volume number of the catalogued document.

461 1 $t Au revoir là-haut $v 1
461 1 $t Au revoir là-haut $v 2
461 1 $t L'aile des vierges $v 1
461 1 $t L'aile des vierges $v 2
...
...

The import file with just the 10 books concerned:

multi-volume_import_problem_with_identical_isbn-13.zip
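For readers unfamiliar with the notation, here is a minimal Python sketch of how the 461 lines quoted in this comment could be split into subfields. It assumes the plain-text layout shown in the comment (tag, indicator, then `$`-prefixed subfields) and that no subfield value contains a `$`. The function name `parse_461` is hypothetical and is not part of BiblioteQ or Catmandu.

```python
def parse_461(line):
    """Split a text line such as '461 1 $t Title $v 2' into its subfields.

    Returns a dict mapping subfield code to value, or None when the line
    does not start with the 461 tag. Assumes values never contain '$'.
    """
    tag, _, rest = line.partition(" ")
    if tag != "461":
        return None
    subfields = {}
    # Everything before the first '$' is the indicator and is skipped.
    for chunk in rest.split("$")[1:]:
        if chunk:
            subfields[chunk[0]] = chunk[1:].strip()
    return subfields


volume_1 = parse_461("461 1 $t Au revoir là-haut $v 1")
volume_2 = parse_461("461 1 $t Au revoir là-haut $v 2")
print(volume_1, volume_2)
```

Both lines carry the same `$t` title and differ only in `$v`, which is the value the proposed "volume" column would hold.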

textbrowser commented 2 years ago

We can relax the ISBN rules or add a volume field.

meteos77 commented 2 years ago

For me, there are not enough fields in the database.

A field for the target audience is also important to me, especially for statistics.

meteos77 commented 2 years ago

I will review how volume numbers appear in UNIMARC, because apparently the volume information is given in several places. In 5 minutes I have already seen 3 of them, for isbn-13 978-2864-9701-18:

200 $h
225 $v
410 $v

textbrowser commented 2 years ago

I'll finish group returns and if I have enough time before 2022.01.30, we can add new fields.

meteos77 commented 2 years ago

There are already a lot of new features in the current version. All of this has to be digested :-)

In the release notes, "Print Icons View" did not appear. I'd rather finish this part for this version before starting on the new fields.

textbrowser commented 2 years ago

The printing works with the basic rendering functions that are given by Qt. There are other solutions which are not as simple:

meteos77 commented 2 years ago

Qt is not perfect :-)

textbrowser commented 2 years ago

It provides the framework and everything else beyond the basic is your responsibility. Printing is a geometric excursion.

meteos77 commented 2 years ago

Would it be easier in an image format instead of PDF? Exporting to PNG, for example.

The most annoying part is the layout, with 50% of the page lost:

A4-print.pdf

textbrowser commented 2 years ago

I have not tried an export of the entire scene. Without implementing it, I don't know the results.

textbrowser commented 2 years ago

I do know that Qt has a limit on images.

textbrowser commented 2 years ago

https://github.com/textbrowser/dooble/issues/106

meteos77 commented 2 years ago

See [subject](https://github.com/textbrowser/biblioteq/discussions/175)

textbrowser commented 2 years ago

https://www.loc.gov/marc/bibliographic/bd020.html

textbrowser commented 2 years ago

ISBN may also be considered to be application invalid if it is not directly applicable to the bibliographic item represented by a particular record. Application invalidity is usually related to the cataloging treatment employed by a particular agency in terms of the number of records involved. For example, if there is a record for a multivolume set as well as separate records for each of the volumes in the set, the ISBN for the set is considered application invalid on the records for the volumes. Only the ISBN applicable to the entity represented by a particular record is considered valid on that record.

textbrowser commented 2 years ago

If you have a multi-volume set and there is one ISBN assigned to the set, one ISBN is presented.

If you have a multi-volume set and there are ISBNs per each item, the set is considered several items.

A new non-essential field may be necessary to create relaxed relationships between individual records. For single multi-volume sets this is not necessary as the description suffices. For example, the description can contain information about each volume in a set. For multiple multi-volume sets (each volume has a unique ISBN), BQ does not strictly require a change. However, a new non-essential field may allow for establishing relationships between database records.

meteos77 commented 2 years ago

in this ticket there are 2 problems

1 - importing multi-volume documents with only 1 isbn for all volumes.

2 - the management of documents forming a multi-volume collection

I'm attaching a test database for point 2: multi_volume_with_different_isbn.sqlite.tar.gz

meteos77 commented 2 years ago

In UNIMARC (SUDOC, fr), the cycle "THE ROMANS" by Max Gallo has 5 documents with 5 different ISBNs.

The fields 225 $a or 461 $t indicate the name of the collection.

The fields 225 $v or 461 $v indicate the volume number.

meteos77 commented 2 years ago

In MARC21 (English-language documents, found by searching the Library of Congress over Z39):

The field 490 1 $a The stone of light indicates the name of the collection.

The field 490 1 $a The stone of light $v v. 1 indicates the volume in the collection.

meteos77 commented 2 years ago

Concerning the import of multi-volume documents with only 1 ISBN: I attached some test files to my first message.

meteos77 commented 2 years ago

I'm discovering the story of the 3 suns and the Trisolarians :-)

notices_navette_2022_02_02.pan_unimarc.tar.gz

yaz-marcdump -f ISO5426 notices_navette_2022_02_02.pan | grep 461

textbrowser commented 2 years ago

Would you like a new database field for sets which have individual books having unique ISBNs? How about "multivolume_set_isbn"?

meteos77 commented 2 years ago

For documents with identical ISBNs, the UNIMARC field 461 $v indicates the part number of the document. An additional field that distinguishes the volume number should therefore allow the import to override the ISBN-13 uniqueness rule.

Example: the last column contains the field 461 $v:

20_Source_PB_Import_File-doublons-with-column-volume_number.csv

textbrowser commented 2 years ago

I'm looking for a yes or a no answer. :P

meteos77 commented 2 years ago

yes for the new field and yes for the heading :-)

textbrowser commented 2 years ago

The ticket's heading?

meteos77 commented 2 years ago

OK for multivolume_set_isbn. Another word from the translator: "entitled".

textbrowser commented 2 years ago

https://github.com/textbrowser/biblioteq/discussions/191

meteos77 commented 2 years ago

If you are setting up a treasure hunt, all the clues are in the release notes :-)

textbrowser commented 2 years ago

Import remains? Do you have a plain summary of the import change(s)?

meteos77 commented 2 years ago

20_Source_PB_Import_File-doublons-without-column-volume_number.csv

Rapport-import-multivolume

textbrowser commented 2 years ago

OK. That's very descriptive.

meteos77 commented 2 years ago

I did not understand your question

meteos77 commented 2 years ago

Does the import remain? For me, the import tool does its job normally:

line 2 has ISBN A
line 3 has ISBN A, so it is rejected
line 4 has ISBN B
line 5 has ISBN B, so it is rejected

Lines 3 and 5 are rejected because their ISBNs are not unique.
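The behaviour described in this comment can be illustrated with a small Python sketch. It is not BiblioteQ's import code. It only assumes a line-by-line import that enforces ISBN uniqueness with no look-ahead. The column names `accession_number` and `isbn13` and the function `import_rows` are hypothetical.

```python
def import_rows(rows):
    """Accept a row unless its ISBN was already imported (no look-ahead)."""
    seen = set()
    accepted, rejected = [], []
    # Line 1 of the CSV is assumed to be the header, so data starts at line 2.
    for line_no, row in enumerate(rows, start=2):
        isbn = row["isbn13"]
        if isbn in seen:
            rejected.append(line_no)
        else:
            seen.add(isbn)
            accepted.append(line_no)
    return accepted, rejected


rows = [
    {"accession_number": "X1", "isbn13": "A"},  # line 2
    {"accession_number": "X2", "isbn13": "A"},  # line 3
    {"accession_number": "X3", "isbn13": "B"},  # line 4
    {"accession_number": "X4", "isbn13": "B"},  # line 5
]
accepted, rejected = import_rows(rows)
print(accepted, rejected)  # [2, 4] [3, 5]
```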

meteos77 commented 2 years ago

In my first message I suggested relying on the UNIMARC field 461 $v, which indicates the volume number.

If ISBN A has the value 2 in 461 $v, it means that we have a book in several volumes which can share an identical ISBN.

textbrowser commented 2 years ago

The new multi-volume field now exists. How should the import behave?

meteos77 commented 2 years ago

How are copies managed when importing a CSV?

meteos77 commented 2 years ago

BiblioteQ analyzes the number-of-copies field, creates 2 documents, and assigns 2 barcodes, X-1 and X-2. This looks like the same problem. The only things that differ with multi-volume books are:

1 - the field 461 $v

2 - the barcode of the library (accession_number)

meteos77 commented 2 years ago

I hope this is understandable.

If the import has a column Z (field 461 $v) containing the number 1: normal import.

If column Z contains a number other than 1, BiblioteQ understands that it must create copies: 1 copy for the number 2, 2 copies for the number 3, ...

It uses the accession_number field (column A) as the barcode of the copy, and on the document's ISBN it activates Multi-volume-Set ISBN.
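To make the proposal concrete, here is one possible reading of it as a Python sketch. It is not an existing BiblioteQ feature. The column names (`accession_number`, `isbn13`, `volume`), the function name, and the choice to store the shared ISBN in `multivolume_set_isbn` are all assumptions made for illustration. It also assumes that rows arrive with volume 1 first.

```python
def import_with_volume_column(rows):
    """Volume 1 creates the record; higher volumes are attached as copies."""
    records = {}
    for row in rows:
        isbn = row["isbn13"]
        volume = int(row["volume"])  # hypothetical column Z, taken from 461 $v
        if volume == 1:
            if isbn in records:
                raise ValueError(f"ISBN {isbn} already imported as volume 1")
            records[isbn] = {
                "isbn13": isbn,
                "multivolume_set_isbn": None,
                "copy_barcodes": [row["accession_number"]],
            }
        else:
            if isbn not in records:
                raise ValueError(f"volume {volume} of {isbn} appears before volume 1")
            record = records[isbn]
            # The library's own barcode (column A) identifies the extra volume.
            record["copy_barcodes"].append(row["accession_number"])
            record["multivolume_set_isbn"] = isbn
    return records


rows = [
    {"accession_number": "X1", "isbn13": "A", "volume": "1"},
    {"accession_number": "X2", "isbn13": "A", "volume": "2"},
    {"accession_number": "X3", "isbn13": "B", "volume": "1"},
]
records = import_with_volume_column(rows)
print(records["A"]["copy_barcodes"])  # ['X1', 'X2']
```

In this reading the import only consults rows it has already processed, but it depends on the volume column being present and on the rows being ordered by volume.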

textbrowser commented 2 years ago

20_Source_PB_Import_File-doublons-without-column-volume_number.csv

Rapport-import-multivolume

The CSV contains items having non-unique ISBNs. The import does not look ahead to determine if something is a copy. Copies are indicated by the quantity field.

You're asking the import to look ahead, make a guess, and assume that the second (or third or fourth or nth) item (having the non-unique ISBN) is in fact a copy. That's silly. That's a broken CSV.

textbrowser commented 2 years ago

Copies in a library have the same ISBN. Copy 1's ISBN is not -1 and copy 2's ISBN is not -2. An ISBN is assigned to an item. Two copies of said item have the same ISBN. Copies have other criteria to differentiate them. Not ISBNs. Broken CSV.

meteos77 commented 2 years ago

No, because I propose adding the Z column, which carries over the 461 $v field.

textbrowser commented 2 years ago

And the CSV is broken because it requires BQ to look ahead and compute your intentions. Is this a copy? Why is it a copy? What if it isn't a copy and the CSV is actually mistaken?

textbrowser commented 2 years ago

No one knows what 461$v means. That's also silly.

textbrowser commented 2 years ago

The export and the import are inverse functions of one another. If the import is something else, then it's another thing. Export(Import(CSV)) = CSV. You don't have this with this file.
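The round-trip property Export(Import(CSV)) = CSV can be checked mechanically. The sketch below is a toy model in Python, not BiblioteQ's importer or exporter. It assumes an import that drops rows whose ISBN was already seen and an export that writes one row per record. The column names and function names are hypothetical.

```python
import csv
import io

FIELDS = ["accession_number", "isbn13", "title"]


def import_csv(text):
    """Toy import: keep the first row for each ISBN, drop later duplicates."""
    records, seen = [], set()
    for row in csv.DictReader(io.StringIO(text)):
        if row["isbn13"] in seen:
            continue
        seen.add(row["isbn13"])
        records.append(row)
    return records


def export_csv(records):
    """Toy export: one CSV row per record, same column order as the import."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


def round_trips(text):
    return export_csv(import_csv(text)) == text


unique = "accession_number,isbn13,title\nX1,A,Volume 1\nX2,B,Other book\n"
duplicated = "accession_number,isbn13,title\nX1,A,Volume 1\nX2,A,Volume 2\n"
print(round_trips(unique))      # True
print(round_trips(duplicated))  # False
```

With unique ISBNs the file survives the round trip. With a repeated ISBN the second row is lost, so the exported file no longer matches the input.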

textbrowser commented 2 years ago

When you export a view, the export does not include copy information. It isn't an export of a database.

textbrowser commented 2 years ago

Whereas the import (in your example) is like a database in the form of a CSV.

textbrowser commented 2 years ago

Typically, people pay a lot of money to have their databases (in whatever form they are) migrated to other databases. Importing is hard work. Now, if you had INSERT statements it would work. You're asking the import tool to be extra smart for specific use-cases.
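As a rough illustration of why explicit INSERT statements avoid the ambiguity, the Python sketch below uses the standard `sqlite3` module with a deliberately simplified, hypothetical schema. The tables `book` and `book_copy` are invented for this example and are not BiblioteQ's actual schema. Only the name `multivolume_set_isbn` comes from this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical schema: one row per title, one row per physical volume.
    CREATE TABLE book (
        isbn13 TEXT PRIMARY KEY,
        title TEXT,
        multivolume_set_isbn TEXT
    );
    CREATE TABLE book_copy (
        barcode TEXT PRIMARY KEY,
        isbn13 TEXT REFERENCES book(isbn13),
        volume INTEGER
    );
    INSERT INTO book VALUES ('A', 'Au revoir là-haut', 'A');
    INSERT INTO book_copy VALUES ('X1', 'A', 1);
    INSERT INTO book_copy VALUES ('X2', 'A', 2);
    """
)
copies = conn.execute(
    "SELECT barcode, volume FROM book_copy WHERE isbn13 = 'A' ORDER BY volume"
).fetchall()
print(copies)  # [('X1', 1), ('X2', 2)]
```

Each statement says which table a row belongs to, so nothing has to be inferred from a repeated ISBN. The two volumes share one ISBN and are told apart by their barcodes.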

textbrowser commented 2 years ago

Actually, you're asking the import tool to treat the CSV like a database.

textbrowser commented 2 years ago

Which is silly and annoying. Anyway, I'll think about it. From my perspective, this work is repetitive and not interesting. From your perspective, the tool has to import whatever you feed it.