The current solution with the generic HTML parser works well for sites such as "Ultimate G...". For some other sites (https://www.chords-and-tabs.net, for example), it works less well: I have to clean up the parsed text file quite a bit.
Therefore, I propose the following solution: The app should contain multiple HTML parsers, each dedicated to one site. The current "generic" parser could be used as a fallback. The site-specific parsers could be unit-tested to ensure that they work as expected. In addition, this lets people request or contribute parsers for the sites they use without impacting the "generic" parser that we have right now.
The interface for the parsers could look as follows:
interface WebPageChordParser {
    // Reports whether the parser supports the given URL.
    fun supportsURL(url: String): Boolean

    // Attempts to convert the given HTML text to a plaintext document
    // which contains just the chords. Returns null on failure.
    fun convertHtmlToText(htmlText: String): String?

    // Attempts to determine the BPM of the song from the given HTML.
    // Returns null if no BPM could be found.
    fun extractBPMFromHtml(htmlText: String): Int?
}
Each parser (the generic one, too) would implement this interface. The WebSearchViewModel could then be provided with a list of these WebPageChordParser objects and iterate over them, asking each one whether it supports the given URL. If no site-specific parser supports the URL, the generic parser would take over as the fallback.
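To make the selection step concrete, here is a rough sketch of how the lookup with fallback might work. Everything apart from the WebPageChordParser interface is made up for illustration: ChordsAndTabsParser, GenericParser and selectParser are hypothetical names, the URL prefix check is a placeholder, and the tag-stripping regex is only a stand-in for whatever the existing generic parser really does.

```kotlin
// The interface as proposed in this issue.
interface WebPageChordParser {
    fun supportsURL(url: String): Boolean
    fun convertHtmlToText(htmlText: String): String?
    fun extractBPMFromHtml(htmlText: String): Int?
}

// Hypothetical site-specific parser. The prefix check is a placeholder
// and the actual extraction logic is left out of this sketch.
class ChordsAndTabsParser : WebPageChordParser {
    override fun supportsURL(url: String): Boolean =
        url.startsWith("https://www.chords-and-tabs.net")

    override fun convertHtmlToText(htmlText: String): String? = null

    override fun extractBPMFromHtml(htmlText: String): Int? = null
}

// Stand-in for the existing generic parser (assumption: it accepts any URL).
// Stripping tags with a regex is only illustrative, not the app's real logic.
class GenericParser : WebPageChordParser {
    override fun supportsURL(url: String): Boolean = true

    override fun convertHtmlToText(htmlText: String): String? =
        htmlText.replace(Regex("<[^>]*>"), "").trim()

    override fun extractBPMFromHtml(htmlText: String): Int? = null
}

// Hypothetical helper: returns the first site-specific parser that claims
// the URL, or the fallback parser if none does.
fun selectParser(
    url: String,
    siteParsers: List<WebPageChordParser>,
    fallback: WebPageChordParser
): WebPageChordParser =
    siteParsers.firstOrNull { it.supportsURL(url) } ?: fallback
```

Keeping the generic parser out of the list and passing it separately means the order of the site-specific parsers cannot accidentally shadow one of them with the catch-all. The WebSearchViewModel would only need to call something like selectParser(url, siteParsers, genericParser) before converting the page.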