BCcampus / pressbooks

Pressbooks – the CMS for books
https://pressbooks.org/
GNU General Public License v3.0

ensure import routines include glossary #37

Open bdolor opened 6 years ago

bdolor commented 6 years ago

With the introduction of a new post type we need to ensure the import routines are capable of bringing over the data.

How do we ensure that a shortcode in the content area of the old book that referenced a glossary term with id='99' still references a glossary term that exists in the new book?

bdolor commented 6 years ago

XML will bring in glossary terms if they exist via https://github.com/BCcampus/pressbooks/commit/a64135176c60cd16be2a5df525a5a8b161adfc35

Including glossary terms in the cloning operation is being handed off to the PB team.

All other import formats will bring in glossary items if they are represented as HTML in a page, but will not bring in the glossary terms themselves, since the terms aren't represented anywhere in an EPUB, DOCX, or ODT file the way they are in XML or via the API. Work on using shortcodes to facilitate import has some momentum: https://github.com/pressbooks/pressbooks/commit/330bee80b344be7a3d5f73c52fd7087672d07d57. I will discuss with the PB team to understand whether including the [pb_glossary] shortcode fits the use case that class-complex was intended to address.

@josieg - let's look at this ticket as done for now. Please verify that XML imports do bring in glossary terms. Feel free to verify that the other export file formats import the HTML display of glossary terms, but know that if it's HTML, the answer is likely yes, it works.

josieg commented 6 years ago

This works. I tested importing via XML, epub, and html files. XML brought in glossary terms. The epub and html brought in the html.

A few questions which might not be solvable:

  1. What if someone tries to import a book with glossary terms via an XML file into their existing book that also has glossary terms? These books have totally different glossary terms, but two of the glossary terms have the same ID.
  2. When importing via an XML file, I am given the option of which pages I want to import. The glossary terms all appear as individual "Glossary" items with no label describing which term is which (see screenshot). This makes it impossible to import just one term; it would be either all or nothing. This could be a problem if a person only wants to import a small section of the book. Can we add the glossary term titles to these?

     [Screenshot: glossaryimport]

  3. Also, what if I don't select any of the glossary terms when importing, just the chapters? The shortcode with the glossary ID within the content is still imported, but now there is no corresponding glossary term. Does this have the potential to cause problems, or will it just make the editor view look messy? Here is a book where I did this: https://pressbooksdev.bccampus.ca/testingimpotwglossary/
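
To make the last question concrete, here is a minimal sketch (in Python, not Pressbooks code) of how one might list the glossary IDs that imported chapters reference but that have no matching term in the book. The function name, the `chapters` mapping, and the regular expression are assumptions for illustration; the only detail taken from this thread is the `[pb_glossary id='…']` shortcode shape.

```python
import re

# Assumed shortcode shape: [pb_glossary id='32'] or [pb_glossary id="32"].
GLOSSARY_ID = re.compile(r"\[pb_glossary\s+id=['\"](\d+)['\"]\]")


def find_orphaned_glossary_ids(chapters, glossary_ids):
    """Return {chapter title: sorted list of referenced IDs with no glossary term}.

    chapters: hypothetical mapping of chapter title -> chapter content.
    glossary_ids: IDs of the glossary terms that actually exist in the book.
    """
    known = set(glossary_ids)
    orphans = {}
    for title, content in chapters.items():
        referenced = {int(found) for found in GLOSSARY_ID.findall(content)}
        missing = sorted(referenced - known)
        if missing:
            orphans[title] = missing
    return orphans
```

A check like this would only report the dangling references; deciding what to do with them (strip the shortcode, warn the user, or import the missing term) is a separate choice.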

bdolor commented 6 years ago

Thanks Josie - number 2 is the most relevant. Will look to address that and then provide responses to the others once #2 is complete.

bdolor commented 6 years ago

Once PR https://github.com/BCcampus/pressbooks/pull/47 is merged, it will take care of 2. To address 1, regarding glossary terms being imported into a book that already has glossary terms and there being an ID collision: for XML import, at least, an ID collision will never happen. Similar to how chapters are created on import, the importer creates a new post for every new glossary term. The act of creating a new post creates a new unique ID, so the relationship between the old ID and the new ID is broken. Put another way, everything except the ID of the old glossary term is carried over.
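
As a rough illustration of why the collision can't happen, here is a small Python sketch (not the actual Pressbooks importer, which is PHP and relies on WordPress to assign post IDs). The function name and the dictionary layout are assumptions; the point is only that each imported term gets a fresh ID from the target book and the source ID is discarded.

```python
from itertools import count


def import_glossary_terms(existing_ids, incoming_terms):
    """Create a new post for every incoming term; return {new_id: post}.

    existing_ids: set of post IDs already in use in the target book.
    incoming_terms: list of dicts with 'id', 'title', and 'content' keys,
                    as they might appear in the source book (assumed shape).
    """
    # Fresh IDs start above anything already in the target book.
    fresh_ids = count(max(existing_ids, default=0) + 1)
    new_posts = {}
    for term in incoming_terms:
        new_id = next(fresh_ids)
        # Everything except the old ID is carried over.
        new_posts[new_id] = {"title": term["title"], "content": term["content"]}
    return new_posts
```

For example, importing a term whose source ID is 99 into a book that already has a post 99 yields a new post with a different ID, which is also why a shortcode still pointing at 99 no longer refers to the imported term.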

The above process describes the challenge of number 3. Similar to how images have to be 'scraped and kneaded' during the import process, a similarly aggressive routine might be considered for glossary term references found as [pb_glossary id='32']apple[/pb_glossary] in the content area. For some added complexity, there's no guarantee at the point of the import process that both a glossary term and its corresponding chapter (or vice versa) will be selected. Nevertheless, looking for shortcodes in content and transforming them to meaningful HTML is the point of the new class-complex.php, but AFAIK that hasn't been implemented yet. It would seem that adding [pb_glossary] to class-complex.php would be reasonable, but I have to confirm with them. Plus, they might want to do it. At any rate, it would only work when importing from XML or the API, where glossary IDs refer to something real. In XHTML or any flavour of HTML (besides web), a reference to a glossary ID is meaningless and doesn't go anywhere. Hope that helps.
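
For what it's worth, the kind of routine described above could look roughly like the following Python sketch. This is not class-complex.php and not an existing Pressbooks function; the name `remap_glossary_shortcodes` and the `id_map` argument (old glossary ID to new glossary ID, which the importer would have to record) are hypothetical. It also assumes the shortcode carries only an `id` attribute, as in the example above.

```python
import re

# Matches [pb_glossary id='32']apple[/pb_glossary], with single or double quotes.
GLOSSARY_SHORTCODE = re.compile(
    r"\[pb_glossary\s+id=(['\"])(\d+)\1\](.*?)\[/pb_glossary\]",
    re.DOTALL,
)


def remap_glossary_shortcodes(content, id_map):
    """Rewrite glossary shortcodes so they point at the newly created terms.

    id_map: hypothetical mapping of old glossary ID -> new glossary ID.
    References to terms that were not imported are reduced to their plain text.
    """
    def replace(match):
        quote, old_id, text = match.group(1), int(match.group(2)), match.group(3)
        new_id = id_map.get(old_id)
        if new_id is None:
            # The term was not selected for import: keep the word, drop the shortcode.
            return text
        return f"[pb_glossary id={quote}{new_id}{quote}]{text}[/pb_glossary]"

    return GLOSSARY_SHORTCODE.sub(replace, content)
```

Falling back to plain text for unmapped IDs is one possible design choice; it covers the case where a chapter is imported without its glossary terms, at the cost of silently losing the link.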

bdolor commented 6 years ago

@josieg - can be (re)validated

josieg commented 6 years ago

Looks great!