Open djplaner opened 2 years ago
But the issue is earlier: there are multiple span class="canvasFileLink" elements
<p>Lecture audio file: theme 1</p><ul><li>
<span class="canvasFileLink">HSY 1. </span>
<span class="canvasFileLink">Polities.(</span><span class="canvasFileLink">1).m4a</span></li></ul><p>Lecture 1 powerpoint</p><ul><li> <span class="canvasFileLink">HSY_1_Polities.pdf</span></li></ul>
The problem is happening earlier, in the Word/Mammoth conversion. checkHTmlView gives:
<span class="canvasFileLink">HSY 1. </span>
<span class="canvasFileLink">Polities.(</span>
<span class="canvasFileLink">1).m4a</span>
Question: is Mammoth doing this because both the P and R styles are configured for canvasFileLink?
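To make the question concrete, here is a minimal sketch of what "both P and R styles configured" could look like as a Mammoth style map. The style name 'Canvas File Link' and the target span.canvasFileLink are assumptions based on the HTML above; the actual word2canvas configuration may differ.

```javascript
// Assumed style map entries, for illustration only.
// The real word2canvas configuration is not shown in this issue.
const styleMap = [
  // Paragraph (P) style mapped to the canvasFileLink span
  "p[style-name='Canvas File Link'] => span.canvasFileLink",
  // Run (R, i.e. character) style mapped to the same span
  "r[style-name='Canvas File Link'] => span.canvasFileLink",
];

// With Mammoth this would be passed as an option, e.g.
//   mammoth.convertToHtml({ arrayBuffer }, { styleMap })
```

If the run mapping is applied per Word run, then a file name stored as three runs would come out as three spans, which would match the output above. That is a hypothesis to test, not a confirmed cause.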
Renee Denham 27/09/2022 4:50 pm
Hi David, think I've found a bug: it's 1 link where the text is 3 words, but it is instead becoming a link per word.
Yep, that's a bug alright. A known one. The second one I mentioned in a message in the AEL LMS Migration chat. I've not had the time to fix it or even properly diagnose it (see previous complaints about limited time). What I know about it follows. In fact, I've dug out the GitHub issue on this: a description of the problem, no fix. (But I will add this explanation to help prompt me.)
I remain uncertain about the actual source. It's some combination of:
- How the CAR process generates the Word document (using a specific Python HTML2Docx module)
- How word2canvas converts the Word document back into HTML (using Mammoth)
Somewhere in there a single style applied to a sequence is being separated into multiple styled items. Perhaps this happens when the Canvas File Link style is a linked style. Perhaps, for these styles, spaces are created as separate items by Mammoth.
I can see three possible solutions:
1. Fix any problem in the CAR process (not the solution, I think)
2. Configure Mammoth not to have this problem
3. Fix it after the fact
For an example of #3 see the postConvert function in c2m_wordConverter.js. There's a "remove any links with empty innerText".
A quick fix might be to detect such sequences of links and fix them. The challenge arises when there is meant to be a sequence of such links: the code would have to try to distinguish what is a meaningful file name and what isn't. Not straightforward. That suggests a need to try a #2 solution, or some combination.
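A minimal sketch of that quick fix, working on the HTML string rather than the DOM. The function name mergeAdjacentFileLinks is hypothetical and not part of word2canvas, and it assumes the spans contain plain text only (as in the checkHTmlView output above). It does not solve the challenge just described: a genuine sequence of separate file links with nothing between them would be merged too.

```javascript
// Hypothetical helper, not part of word2canvas: merges runs of adjacent
// <span class="canvasFileLink"> elements in an HTML string into one span.
// Assumes each span holds plain text only (no nested tags).
function mergeAdjacentFileLinks(html) {
  // Two or more spans separated only by whitespace, matched as one run.
  const run = /(?:<span class="canvasFileLink">[^<]*<\/span>\s*){2,}/g;
  // A single span, capturing its text.
  const inner = /<span class="canvasFileLink">([^<]*)<\/span>/g;

  return html.replace(run, (match) => {
    // Collect the text of every span in the run, in order.
    const parts = [...match.matchAll(inner)].map((m) => m[1]);
    // Keep whatever whitespace followed the last span in the run.
    const trailing = match.match(/\s*$/)[0];
    return `<span class="canvasFileLink">${parts.join("")}</span>${trailing}`;
  });
}
```

Whitespace between spans is dropped while text inside each span is kept, so a trailing space inside a span (as in "HSY 1. ") survives the merge. A lone span is left untouched.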
FWIW, I've done a bit more testing (it arose out of the migration work I'm focusing on). I'll be adding this to the GitHub issue and trying to get to it at some stage. Renee, if you can, could you share a copy of the Word doc you had the problem with? I'm assuming it was taken from the CAR? Your document and the following should help with figuring out a kludge solution, if not an actual solution.
The attached Word doc (created by hand) generates this page - see image below. The two problem links were generated by adding in some text after the initial application of the style (the "a" and the "hello").
I then modified the document by copying and pasting the "hello" link and reapplying the Canvas File Link style, resulting in this page, i.e. that re-application fixed the issue.
What is the HTML in postConvert?
There are six visible Canvas File Link elements in the Word document. Not all of them are links. Some are just the name:
- McNamara.pdf
- Visual Analysis of a Photograph
- Donna K. Reid: Thinking and Writing About Art History
- Donna K Reid.pdf
- Donna K Reid.pdf
- Donna hello K. Reid: Thinking and Writing About Art History
but the HTML says there are 12 .canvasFileLink (showing innerHTML / href )
There may be hope with the tab, but for now my focus was on the canvasFileLink issue. This image has the HTML generated from Renee's Word doc. It mirrors what I see in mine, revealing two possible cases
1. The empty canvasFileLink. Random space left over (see the last canvasFileLink below just before the embed). This may be due to the way run styles work in Word when you're manually editing, leaving a space at the end.
2. The multiple sequential spans breaking up a single file name. This is what appears to happen when you manually do some edits.
Next step
When converting to HTML, check for canvasFileLink problems (below), report the problem, and provide a link to docs explaining how to fix it
Perhaps look at removing the empty lone link
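A rough sketch of such a check, covering both cases (the empty span, and a span that directly follows another). The function name findFileLinkProblems and the shape of the returned objects are hypothetical; it works on the HTML string and assumes the spans contain plain text only.

```javascript
// Hypothetical checker, not part of word2canvas. Returns a list of
// suspected canvasFileLink problems found in an HTML string.
// Assumes each span holds plain text only (no nested tags).
function findFileLinkProblems(html) {
  const span = /<span class="canvasFileLink">([^<]*)<\/span>/g;
  const problems = [];
  let prevEnd = -1; // index just past the previous span, -1 if none yet

  for (const m of html.matchAll(span)) {
    const text = m[1];

    // Case 1: span with no visible text (e.g. a leftover space).
    if (text.trim() === "") {
      problems.push({ type: "empty", index: m.index });
    }

    // Case 2: span separated from the previous span by whitespace only,
    // which suggests one file name split across several spans.
    if (prevEnd >= 0 && html.slice(prevEnd, m.index).trim() === "") {
      problems.push({ type: "split", index: m.index, text });
    }

    prevEnd = m.index + m[0].length;
  }
  return problems;
}
```

A run of three adjacent spans produces two "split" entries (one for each span that follows another). The result could drive a warning that points the user at docs on reapplying the Canvas File Link style, and the "empty" entries identify the lone links that might be removed.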
Original problem
5251LAW_3218_GC having issues with multiple files in a row. The last file works, but the first 3 do not.
Topic 11A
The issue appears to be more related to single links ending up as multiple links in what is meant to be one link