hwgilbert16 / scholarsome

Web-based interactive flashcard learning software
http://scholarsome.com
GNU Affero General Public License v3.0
492 stars 26 forks

Importing Quizlet sets containing semicolons ; yields errors #74

Closed RannyBergamotte closed 9 months ago

RannyBergamotte commented 9 months ago

If you try to import a Quizlet set that has semicolons in a term or definition, it yields a "set not formatted correctly" error. I don't know if there is a way to detect where a term or definition starts or ends, but I guess the solution is to just go through the set and remove the semicolons.

I guess also maybe setting custom breaks could work too, or maybe even let the user decide what the break character is, so they can assign it some obscure Unicode character that's never getting used?

hwgilbert16 commented 9 months ago

The issue you encountered comes from the fact that Scholarsome uses semicolons to separate the lines of a Quizlet set, with each semicolon denoting a new card.
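To make the failure concrete, here is a minimal sketch of a parser that splits on the semicolon and tab separators mentioned in this thread. It is not Scholarsome's actual code: the names `Card` and `parseQuizletExport` and the `null` return value are assumptions made for illustration.

```typescript
// Hypothetical sketch; these names are illustrative, not Scholarsome's API.
interface Card {
  term: string;
  definition: string;
}

// Splits cards on ";" and the two sides of a card on a tab.
// Returns null if any card does not have exactly two sides.
function parseQuizletExport(text: string): Card[] | null {
  const cards: Card[] = [];
  for (const chunk of text.split(";")) {
    if (chunk.trim() === "") continue; // tolerate a trailing separator
    const sides = chunk.split("\t");
    if (sides.length !== 2) return null; // the "not formatted correctly" case
    const [term, definition] = sides as [string, string];
    cards.push({ term: term.trim(), definition: definition.trim() });
  }
  return cards;
}

// A clean export parses into two cards.
const validCards = parseQuizletExport("cat\tgato;dog\tperro");
console.log(validCards);

// A semicolon inside a definition creates a stray chunk (" a feline")
// with no tab in it, so the whole import is rejected.
const brokenCards = parseQuizletExport("cat\tgato; a feline;dog\tperro");
console.log(brokenCards); // null
```

Because the semicolon is treated as a card boundary wherever it appears, the parser cannot tell a separator from punctuation inside a definition.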

The issue with Quizlet is that, as far as I can tell, they seem to actively interfere with features that let users export their sets to other platforms. The initial plan for the import feature was to scrape the HTML of the page, but they block non-humans from viewing their site. I'm stuck with using their export to text feature, which is only accessible to the owner of a set, and it exports to an unwieldy text box if the set is large enough, whereas Anki exports the cards to a neat text file. You also lose any pictures in the set when exporting from Quizlet.

I'll add this as a future feature. As you said, the best way to import sets from Quizlet is really to let the user pick the characters that separate the two sides of a card and that separate one card from the next. I hardcoded the semicolon and tab separators to keep things simple; otherwise the docs would have to explain to novice users how the import works, which could turn people away.
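A rough sketch of what user-selectable separators could look like follows. This is an assumption about the shape of such a feature, not the implementation that ended up in Scholarsome; `Flashcard`, `SeparatorOptions`, `defaultSeparators`, and `parseWithSeparators` are hypothetical names.

```typescript
// Hypothetical sketch; names and defaults are assumptions, not Scholarsome's API.
interface Flashcard {
  term: string;
  definition: string;
}

interface SeparatorOptions {
  betweenSides: string; // separates the term from the definition
  betweenCards: string; // separates one card from the next
}

// Defaults mirror the tab and semicolon separators described in this thread.
const defaultSeparators: SeparatorOptions = { betweenSides: "\t", betweenCards: ";" };

function parseWithSeparators(
  text: string,
  options: SeparatorOptions = defaultSeparators
): Flashcard[] {
  if (options.betweenSides === "" || options.betweenCards === "") {
    throw new Error("Separators must not be empty");
  }
  const cards: Flashcard[] = [];
  for (const chunk of text.split(options.betweenCards)) {
    if (chunk.trim() === "") continue; // skip empty chunks such as a trailing one
    // Split at the first side separator only, so the definition may contain it.
    const at = chunk.indexOf(options.betweenSides);
    if (at === -1) throw new Error(`No side separator in card: "${chunk}"`);
    cards.push({
      term: chunk.slice(0, at).trim(),
      definition: chunk.slice(at + options.betweenSides.length).trim(),
    });
  }
  return cards;
}

// With separators that never occur in the set, a semicolon in a definition is harmless.
const customCards = parseWithSeparators("cat@@gato; a feline||dog@@perro", {
  betweenSides: "@@",
  betweenCards: "||",
});
console.log(customCards);
```

Keeping the tab and semicolon as defaults would let novice users ignore the option entirely, which addresses the documentation concern above.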

If the set is too large to reasonably hand-import, then besides removing the semicolons, you can try importing the Quizlet set into Anki using a plugin (https://ankiweb.net/shared/info/538351043), exporting it from Anki as a .apkg, and then importing that into Scholarsome.

RannyBergamotte commented 9 months ago

Ok so I think I may have found a bug, would you want me to create a separate issue?

Steps to recreate (As far as I know at least)

  1. Add an invalidly formatted Quizlet export, which has a semicolon in a definition (maybe term too, haven't checked)
  2. Attempt to import into Scholarsome. Fails as expected
  3. Fix error in Quizlet and re-export properly this time
  4. Attempt to import into Scholarsome. Improperly fails, even though it is valid
  5. Re-opening the import dialog and re-pasting doesn't fix it
  6. Refreshing (or maybe hard-refresh?) the page fixes it, trying to import the valid Quizlet export works as expected now

I think it would also be nice to have either a more detailed error message or like a highlighter telling you where and how it failed.
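One way such an error message could be produced is sketched below, assuming the tab and semicolon separators discussed earlier in the thread. The `ImportResult` shape and the `parseWithDiagnostics` function are hypothetical and only illustrate the idea of reporting where the parse failed.

```typescript
// Hypothetical sketch; the result shape is an assumption made for illustration.
type ImportResult =
  | { ok: true; cards: { term: string; definition: string }[] }
  | { ok: false; cardNumber: number; snippet: string; reason: string };

function parseWithDiagnostics(text: string, sideSep = "\t", cardSep = ";"): ImportResult {
  const cards: { term: string; definition: string }[] = [];
  const chunks = text.split(cardSep);
  for (let i = 0; i < chunks.length; i++) {
    const chunk = chunks[i] ?? "";
    if (chunk.trim() === "") continue; // skip empty chunks such as a trailing one
    const sides = chunk.split(sideSep);
    if (sides.length !== 2) {
      // Report the 1-based position of the offending chunk and a short excerpt.
      return {
        ok: false,
        cardNumber: i + 1,
        snippet: chunk.trim().slice(0, 40),
        reason: `expected 2 sides, found ${sides.length}`,
      };
    }
    const [term, definition] = sides as [string, string];
    cards.push({ term: term.trim(), definition: definition.trim() });
  }
  return { ok: true, cards };
}

const importResult = parseWithDiagnostics("cat\tgato; a feline;dog\tperro");
if (!importResult.ok) {
  // prints: Card 2 ("a feline"): expected 2 sides, found 1
  console.log(`Card ${importResult.cardNumber} ("${importResult.snippet}"): ${importResult.reason}`);
}
```

A UI could use the reported position and excerpt to highlight the offending text in the import dialog.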

RannyBergamotte commented 9 months ago

Also, as a separate enhancement, supporting newlines on import would be nice.

Just wanna say, thank you so much for this awesome tool/website, it's great to have some FOSS after Quizlet tried to screw us all over by making everyone pay for Quizlet Plus if they want a usable study experience. Looking forward very much to the future of this project! I'm not great at coding, especially with TypeScript, as most of my experience comes from making stuff in Motion Canvas, but if there is anything I can do to contribute, I'd be happy to try to help in the upcoming winter break!

hwgilbert16 commented 9 months ago

No problem! I'm always glad to hear that Scholarsome is being used for its intended purpose and that it's helping out in one way or another.

"I'd be happy to try to help in the upcoming winter break!"

Sure, feel free to send me a message on the Discord server if you'd like to contribute in some way. I'm sure there's something that can be found that's in your wheelhouse.

hwgilbert16 commented 9 months ago

Newline support and user-configurable discriminator added in 823efde4454147bc1b1a3497623feb8d2cf98e80