Capitains / MyCapytain

Texts API and Textual Resources Utility Library for Python 3
http://mycapytain.readthedocs.org
Mozilla Public License 2.0

Add XML collections classes for new CapiTainS guidelines #198

Open sonofmun opened 4 years ago

sonofmun commented 4 years ago

Basically, we need classes parallel to those in MyCapytain.resources.collections.cts.

sonofmun commented 4 years ago

As far as I can see, we will need classes for readable (I see no difference among editions, translations, and commentaries that would require separate sub-classes), work, and textgroup.

PonteIneptique commented 4 years ago

Hmm, you should only have two classes, maybe three.

sonofmun commented 4 years ago

I can agree with that. The question then is whether the metadata for a collection should be read all the way down to its leaves. That is, if the members of the collection I am requesting refer to non-readable collections with their own external __capitains__.xml, should that external XML file also be opened and its metadata and members extracted? Or should only the information in the metadata file for the requested collection be shown? My feeling is that it would be better to go all the way to the leaves, but I worry that this may be too resource-intensive for large collections.
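To make the trade-off concrete, here is a minimal sketch of a walk that parses each metadata file only when it is reached and can optionally stop at a given depth. Everything in it is hypothetical: the `Collection` dataclass, the `walk` function, the `max_depth` parameter, and the in-memory `STORE` are illustrations, not part of MyCapytain or the guidelines.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterator, List, Optional


@dataclass
class Collection:
    """Hypothetical stand-in for one parsed collection metadata file."""
    identifier: str
    readable: bool = False
    # Identifiers of the members listed in this collection's own file.
    member_ids: List[str] = field(default_factory=list)


def walk(root_id: str,
         load: Callable[[str], Collection],
         max_depth: Optional[int] = None,
         _depth: int = 0) -> Iterator[Collection]:
    """Yield a collection and its descendants, loading each one only when reached."""
    collection = load(root_id)
    yield collection
    # Stop descending once the (optional) depth limit is hit.
    if max_depth is not None and _depth >= max_depth:
        return
    for member_id in collection.member_ids:
        yield from walk(member_id, load, max_depth, _depth + 1)


# Toy in-memory store; a real loader would open and parse XML files instead.
STORE: Dict[str, Collection] = {
    "root": Collection("root", member_ids=["group"]),
    "group": Collection("group", member_ids=["work"]),
    "work": Collection("work", member_ids=["edition"]),
    "edition": Collection("edition", readable=True),
}

shallow = [c.identifier for c in walk("root", STORE.__getitem__, max_depth=1)]
full = [c.identifier for c in walk("root", STORE.__getitem__)]
readable = [c.identifier for c in walk("root", STORE.__getitem__) if c.readable]
```

Because `walk` is a generator, a caller that only needs the first few descendants never pays for the rest of the tree, which is one possible answer to the cost concern for large collections.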

PonteIneptique commented 4 years ago

I need to think about it. My worry is: how do you know the ids of the child collections if you do not parse them? (Right now I think this is not held in the schema.)

sonofmun commented 4 years ago

From what I see in the RNG in the guidelines, the identifier for the child collection must be given either as an @identifier attribute on the <collection> element, if it is a remote collection, or as an <identifier> sub-element of the <collection> element, if it is a local collection. So the ID should be available. But getting the IDs of the grandchildren, etc., would require parsing the 'remote' file.
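As a rough illustration of reading child IDs without opening any other file, the sketch below handles both forms described above. The sample document is an assumption: the `<members>` wrapper, the absence of namespaces, and the example URNs are invented for the example and are not taken from the RNG; only the `@identifier` attribute and the `<identifier>` sub-element come from the description in this thread.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Assumed, simplified layout; the real schema likely uses namespaces.
SAMPLE = """
<collection>
  <identifier>urn:example:parent</identifier>
  <members>
    <collection identifier="urn:example:remote-child"/>
    <collection>
      <identifier>urn:example:local-child</identifier>
    </collection>
  </members>
</collection>
"""


def child_identifiers(xml_text: str) -> List[Tuple[str, str]]:
    """Return (identifier, kind) for each direct child collection."""
    root = ET.fromstring(xml_text.strip())
    results: List[Tuple[str, str]] = []
    members = root.find("members")
    if members is None:
        return results
    for child in members.findall("collection"):
        # Remote form: the ID is carried by an attribute.
        attr_id = child.get("identifier")
        if attr_id is not None:
            results.append((attr_id, "remote"))
            continue
        # Local form: the ID is carried by a sub-element.
        elem = child.find("identifier")
        if elem is not None and elem.text:
            results.append((elem.text.strip(), "local"))
    return results


children = child_identifiers(SAMPLE)
```

Only direct children are visible this way; the IDs of grandchildren are not in this file at all.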

PonteIneptique commented 4 years ago

That's already the case with Capitains resources. It's the parsing using the local resolver that connects everything. For remote, I think the point is to give access to the id, but not actually retrieve the data. Though, maybe we should add a dc:title possibility for remote :/

sonofmun commented 4 years ago

That seems less than ideal. If the remote source changes the title or something else, then it doesn't come over. I guess if the idea is that the user will always want to use their own title for remote resources, then that would be OK. Otherwise, best practice would say to get the metadata from the remote source. Perhaps we could add a parameter getremote so that the user can choose whether to follow the tree all the way to the leaves.
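A possible shape for such a parameter is sketched below. Only the name `getremote` comes from the suggestion above; `collect_ids`, the `fetch_remote` callback, and the toy `LOCAL` and `REMOTE` data are hypothetical. The caller supplies the fetcher, since nothing here assumes a particular format at the remote end.

```python
from typing import Callable, Dict, List, Optional, Tuple

Member = Tuple[str, bool]  # (identifier, is_remote)


def collect_ids(identifier: str,
                local: Dict[str, List[Member]],
                getremote: bool = False,
                fetch_remote: Optional[Callable[[str], List[Member]]] = None,
                _remote: bool = False) -> List[str]:
    """List a collection's ID and its descendants' IDs, optionally following remote links."""
    ids = [identifier]
    if _remote:
        if not (getremote and fetch_remote):
            # Keep the link's ID but do not retrieve anything behind it.
            return ids
        members = fetch_remote(identifier)
    else:
        members = local.get(identifier, [])
    for child_id, child_is_remote in members:
        # Anything found behind a remote link is itself treated as remote.
        ids.extend(collect_ids(child_id, local, getremote, fetch_remote,
                               _remote or child_is_remote))
    return ids


# Toy data standing in for parsed local files and a remote source.
LOCAL: Dict[str, List[Member]] = {
    "root": [("local-work", False), ("remote-coll", True)],
    "local-work": [("local-edition", False)],
}
REMOTE: Dict[str, List[Member]] = {
    "remote-coll": [("remote-edition", False)],
}


def fetch(identifier: str) -> List[Member]:
    """Hypothetical fetcher; a real one would make a network request."""
    return REMOTE.get(identifier, [])


default_ids = collect_ids("root", LOCAL)
followed_ids = collect_ids("root", LOCAL, getremote=True, fetch_remote=fetch)
```

With the default `getremote=False`, the remote collection still appears by ID, so it can be shown as a link without any retrieval.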

PonteIneptique commented 4 years ago

The question might actually find its answer in our understanding of remote collections. This was meant as a way to express a link to the outside, but in no case does it expect a specific format there. The issue with parsing remote is that you would have to know what kind of content you'll find (CTS API? DTS? Wii source?), and that would prove difficult to do.

sonofmun commented 4 years ago

True. I guess I was thinking in terms of DTS there. Is it then expected that the provider of this remote source will give some kind of instructions for how to parse their metadata files? Or is remote meant to be completely independent of MyCapytain and Capitains in general?

PonteIneptique commented 4 years ago

It's supposed to be independent. An implementation might make use of it, but more as an external link right now than as a supposedly internal resource.

sonofmun commented 4 years ago

OK. And if the link is not a remote link, should we go all the way to the leaves, i.e., to all the readable descendants of the collection?