CopticScriptorium / cts

Coptic Scriptorium's website for reading digitized Coptic texts and CTS URN resolution
http://data.copticscriptorium.org
Apache License 2.0

Pseudoelements in HTML not visualized #24

Closed lukehollis closed 9 years ago

lukehollis commented 9 years ago

Looks like the pseudoelements and other data from the HTML aren't being visualized correctly. Renaming properly.

lukehollis commented 9 years ago

It appears that this is fixed re: https://github.com/CopticScriptorium/cts/issues/45 with trusting HTML as safe. Continuing to test and verify this.

lukehollis commented 9 years ago

Not fixed with rendering <ruby> and <rt> elems via https://github.com/CopticScriptorium/cts/issues/45#issuecomment-84167549

lukehollis commented 9 years ago

HTML is now visualized directly from ANNIS via an iframe. Hopefully this can also showcase the work of @amir-zeldes's student on the HTML visualizer a little better, by loading Vaadin and ANNIS directly on each single document page.

amir-zeldes commented 9 years ago

I'm not sure about this as a concept. Doesn't this defeat the purpose of caching the content? This puts more load on the ANNIS server, is not robust in case of ANNIS server downtime, takes considerable time to load for the user, and is automatically altered whenever something changes in ANNIS. I thought part of the point was to make a conscious decision if/when we want to update the HTML (the 'press play to update' administration concept, so we can disable updates on a corpus if we're testing something and stick to an older static copy). The idea of storing static HTML was to avoid actively querying ANNIS on user demand, and I don't remember a discussion about reversing this decision. Why is this being done instead of caching as planned? Maybe @ctschroeder can weigh in on this too.
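As a rough illustration of the caching model described above (serve a stored copy, contact ANNIS only on a deliberate refresh, and allow updates to be disabled per corpus), here is a minimal Python sketch. The class name `VisualizationCache`, its methods, and the `fetch` callable are all hypothetical and are not taken from the cts codebase; the real resolver may store and refresh its copies quite differently.

```python
from typing import Callable, Dict, Optional, Set, Tuple


class VisualizationCache:
    """Serve stored HTML copies; contact the source only on an explicit refresh."""

    def __init__(self, fetch: Callable[[str, str], str]) -> None:
        # `fetch` is a hypothetical callable that retrieves the HTML for
        # (corpus, document) from the visualization server.
        self._fetch = fetch
        self._store: Dict[Tuple[str, str], str] = {}
        self._frozen: Set[str] = set()

    def freeze(self, corpus: str) -> None:
        """Disable updates for a corpus, e.g. while something is being tested."""
        self._frozen.add(corpus)

    def unfreeze(self, corpus: str) -> None:
        """Allow updates for a corpus again."""
        self._frozen.discard(corpus)

    def refresh(self, corpus: str, document: str) -> bool:
        """Explicit 'press play to update'. Returns False if the corpus is frozen."""
        if corpus in self._frozen:
            return False
        self._store[(corpus, document)] = self._fetch(corpus, document)
        return True

    def get(self, corpus: str, document: str) -> Optional[str]:
        """Return the stored copy without ever contacting the source."""
        return self._store.get((corpus, document))
```

In this sketch a page view only ever calls `get`, so user traffic and search bots never trigger a request to the source server; only an administrator-initiated `refresh` does.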

From: Luke Hollis [mailto:notifications@github.com] Sent: Friday, March 20, 2015 19:44 To: CopticScriptorium/cts Cc: Amir Zeldes Subject: Re: [cts] Pseudoelements in HTML not visualized (#24)

HTML now visualized directly from ANNIS via an iframe. Hopefully this can showcase @amir-zeldes https://github.com/amir-zeldes student's work on the HTML visualizer a little bit better as well by directly loading Vaadin and ANNIS on each single document page.


lukehollis commented 9 years ago

We can definitely change it back if need be! I was going from @ctschroeder's comment here: https://github.com/CopticScriptorium/cts/issues/45#issuecomment-84170660 Either way is fine. I was hoping it might showcase your student's work a little better and make the URN resolver simpler for future maintenance and development. It will also increase site performance if we don't want to roll this back.

ctschroeder commented 9 years ago

Yes I agree with Amir. The problem I noticed was that the html visualizations were just fundamentally wrong. Missing big stuff (like the pos tags in the analytic viz). I have no idea what pseudo elements are, but I do know that the html and css should come straight from ANNIS. I do believe I said they should then be cached and not refreshed again until someone requests the viz after the documents are next updated.

I think you and I are on the same page. My main concern was that a) the viz was wrong, and b) I don't want us to have to change anything in Luke's database/resolver about the css every time a doc is added or updated. This thing should read the html/css from ANNIS and cache it.

@amir-zeldes please correct me if I am misunderstanding.


ctschroeder commented 9 years ago

Part of the problem, as I mentioned to Luke, is that I am not conversant with the technical terms. No clue what pseudo elements and iframes are. I am trying to be clear here: the viz need to come directly from ANNIS but then be cached. These tickets need to be more descriptive to avoid misunderstanding.

My comments never said to reverse the decision about caching; only to get the html and css directly from ANNIS w/o modifications required on this end. If we need to change the layout/design of the website, that's fine. But no mods and yes caching are essential.


lukehollis commented 9 years ago

Rolled back the iframe change to the cached HTML. I'm trying to make these tickets as descriptive as possible and apologize for any miscommunications. All HTML is taken straight from ANNIS as is. There are no changes to the HTML the URN resolver caches.

ctschroeder commented 9 years ago

Viz are now inaccurate again. Please fix and compare each viz in each corpus to what you see in ANNIS to ensure accuracy.


lukehollis commented 9 years ago

The features that are missing from the HTML visualizations in the URN resolver are added by the dynamically generated CSS in the head element of each visualization page, not by the HTML itself. [image: inline_css]

To display the HTML properly, we need to have the dynamically generated styles for each document loaded on the page. I could configure the ingest to cache a copy of these styles with the other HTML and load them on the page as need be, but this is a tricky process because there's no guarantee that the styles won't interfere with each other. We originally weren't planning on ingesting CSS from the ANNIS-generated HTML, but we totally can if need be!
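To make the idea of caching the styles alongside the HTML more concrete, here is a minimal sketch of pulling the contents of `style` elements out of a page's head at ingest time. It uses only Python's standard `html.parser` module; the names `HeadStyleExtractor` and `extract_head_styles` are hypothetical, and this is not the ingest code the resolver actually uses.

```python
from html.parser import HTMLParser
from typing import List


class HeadStyleExtractor(HTMLParser):
    """Collect the text of every <style> element found inside <head>."""

    def __init__(self) -> None:
        super().__init__()
        self._in_head = False
        self._in_style = False
        self._buffer: List[str] = []
        self.styles: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self._in_head = True
        elif tag == "style" and self._in_head:
            self._in_style = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "head":
            self._in_head = False
        elif tag == "style" and self._in_style:
            # Store one string per <style> element.
            self.styles.append("".join(self._buffer))
            self._in_style = False

    def handle_data(self, data):
        # Style content may arrive in several chunks, so buffer it.
        if self._in_style:
            self._buffer.append(data)


def extract_head_styles(html: str) -> List[str]:
    """Return the CSS text of each <style> element in the document head."""
    parser = HeadStyleExtractor()
    parser.feed(html)
    parser.close()
    return parser.styles
```

The extracted strings could then be stored next to the cached HTML and written back into the page when the document is served. Style elements outside the head are deliberately ignored in this sketch.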

The complications here mean the iframe solution might be a good alternative to caching the CSS along with the HTML, but both are good options. At NPR we developed a responsive iframe solution for scaling the iframe down for mobile browsers if it'd be helpful.

Whatever you both support here (ingesting the custom CSS, using an iframe, or another solution) sounds like a plan to me! Let me know how you'd like to go forward here, and I'll do the best I can to execute.

ctschroeder commented 9 years ago

I don't know what an iframe is. This may need to wait until Amir is back.


amir-zeldes commented 9 years ago

An iframe is like a little HTML window inside an HTML page, which can contain an entire HTML page, including the head where the embedded CSS styles are contained. I have less of an issue with the iframe (I know there are some compatibility issues, so we'd have to test that it works on all major browsers as we expect), but my bigger issue is the dynamic load from ANNIS whenever the page is loaded. This could be triggered by multiple users for the same page more or less frequently, but also by search bots, and the loading time seems substantial to me (it takes a while in ANNIS too, but part of the attraction of the static HTML copies was the fact that they appeared instantaneously).

Ripping the CSS out of the head on ingest and creating extra styles in a CSS style sheet is not a bad idea if you want to avoid iframes and can't reintroduce the embedded styles into the main head (you might be able to do that with JavaScript though?). We can ensure naming conflicts don't occur by automatically prepending a prefix to all ANNIS-derived styles (e.g. annis- or something). There are probably several different ways to go about this, but I just want to avoid the dynamic load from ANNIS every time; the interface will be all the more responsive for it.
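One possible reading of the prefixing suggestion above is to rename every class in the ingested CSS and in the matching HTML so that both carry the same prefix. The Python sketch below does this with regular expressions. The function names are hypothetical, and it assumes simple rule blocks, class selectors, and double-quoted `class` attributes; selectors containing dots inside attribute values, or declarations containing braces, would need a real CSS parser.

```python
import re

# A class selector such as ".tok"; requiring a letter or underscore after the
# dot avoids touching numbers like "1.5em".
CLASS_IN_SELECTOR = re.compile(r"\.(-?[A-Za-z_][\w-]*)")
# One "selector { declarations }" rule without nested braces.
RULE = re.compile(r"([^{}]+)\{([^{}]*)\}")
# A double-quoted class attribute in the HTML.
CLASS_ATTR = re.compile(r'class="([^"]*)"')


def prefix_css_classes(css: str, prefix: str = "annis-") -> str:
    """Prefix class names in selectors only, leaving declarations untouched."""

    def rewrite_rule(match):
        selector, body = match.group(1), match.group(2)
        selector = CLASS_IN_SELECTOR.sub(
            lambda m: "." + prefix + m.group(1), selector
        )
        return selector + "{" + body + "}"

    return RULE.sub(rewrite_rule, css)


def prefix_html_classes(html: str, prefix: str = "annis-") -> str:
    """Apply the same prefix to every name in double-quoted class attributes."""

    def rewrite_attr(match):
        names = match.group(1).split()
        return 'class="' + " ".join(prefix + name for name in names) + '"'

    return CLASS_ATTR.sub(rewrite_attr, html)
```

Because both functions apply the same prefix, pseudo-element rules such as `.tok:before` keep matching their elements after renaming, while no longer colliding with class names used by the surrounding site.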


lukehollis commented 9 years ago

Okay, pulling the CSS out of the head and ingesting it with the HTML sounds like a plan! If we start having a lot of conflicting styles in the CSS, we can revisit this for namespacing the CSS further. If iframes seem like a good option sometime in the future, we can always add them in later.

ctschroeder commented 9 years ago

Thank you Amir. I am in total agreement: html needs to cache and needs to viz correctly. If I understand you correctly, there are ways to do this without having to manually fuss with the css every time you load a new doc or change something in the original ANNIS corpora. (At least, that is how I am interpreting "ripping" and using JavaScript.) Those are my priorities: caching, accuracy in viz, no manually recreating/editing the css every time we ingest.


lukehollis commented 9 years ago

I think that we can hit all those priorities! I added the styles tag to the ingest logic and will ensure the pseudoelements display properly as soon as the next ingest completes.

lukehollis commented 9 years ago

Okay, all the CSS from the style elements in the document head for each visualization is ingested, and the pseudoelements are displayed correctly. Thanks to you both for the brainstorming and help problem-solving here! Fortunate to be working with you both!

ctschroeder commented 9 years ago

Ok, if you're caching as directed by Amir, then great. I can't see how it's working in the background. Just know what it looks like up front. Am testing lots of different viz now. A couple of notes:

- The Besa corpus diplomatic visualizations are off in the urn application. The app does something strange with the columns a couple of pages into Thieving Nuns.
- Analytic viz not working in Fox
- Diplomatic viz not working in Fox
- The analytic viz for A22 needs to be added -- it is there in ANNIS, but not in the "document view" in ANNIS right now. You can only find it by doing a search and then seeing the list of visualizations pop up under each search result.

ctschroeder commented 9 years ago

someone else who is more conversant with the tech needs to check this to close it

amir-zeldes commented 9 years ago

Visualizations with pseudo elements look OK to me, closing.