Open loadko opened 1 year ago
Overview
List of refactors or improvements that can be done to the scraping folder.
List
- [ ] Create a function for this piece of code
const rawValue = $el.find(selector).text()
const cleanedValue = cleanText(rawValue)
const value = typeOf === 'number' ? Number(cleanedValue) : cleanedValue
Something like
getValueFromElement($, selector, typeOf)
- [ ] Export getImageFromTeam from mvp.js to a util file and use it in scrapers
Other improvements are welcome.
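The first item could be sketched roughly as follows. This is only a sketch: it assumes the first argument is the Cheerio-wrapped row element (the `$el` in the snippet above) rather than the root `$`, and it defines a stand-in `cleanText`, since the project's real helper is not shown here.

```javascript
// Stand-in for the project's cleanText helper (assumption: the real one may differ).
function cleanText(text) {
  return text.replace(/\s+/g, ' ').trim()
}

// Reads the text inside `selector` under `$el` and converts it when typeOf is 'number'.
function getValueFromElement($el, selector, typeOf) {
  const rawValue = $el.find(selector).text()
  const cleanedValue = cleanText(rawValue)
  return typeOf === 'number' ? Number(cleanedValue) : cleanedValue
}
```

Each scraper could then call `getValueFromElement($(el), selector, typeOf)` inside its row loop instead of repeating the three lines.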
It's only an idea.
Add support for multiple languages in the scraping application to be able to extract information from web pages in different languages.
To add multi-language support to the scraper, first modify the getTopScoresList function to accept a language parameter indicating the language of the web page to extract data from.
Then add an Accept-Language header to the options object that is passed to the fetchAndParse function, to tell the web server which language the information should be in.
To match the selectors to the specified language, use a control flow structure such as a switch, or a mapping object, to pick the correct selectors.
First, modify the getTopScoresList function so that it accepts a language parameter indicating the language of the web page from which you want to extract the information:
export async function getTopScoresList($, language) {
  // code here
}
Then, add an Accept-Language header to the options object that is passed to the fetchAndParse function to tell the web page server that you want the information in the specified language.
const options = {
  headers: {
    'Accept-Language': language
  }
}
const $ = await fetchAndParse(URL, options)
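A small helper could keep the header optional, so callers that pass no language keep the current behaviour. `buildRequestOptions` is a hypothetical name, and this assumes `fetchAndParse` forwards its second argument to the underlying HTTP request, which is not confirmed here.

```javascript
// Hypothetical helper: builds the options object handed to fetchAndParse.
function buildRequestOptions(language) {
  // No language given: send no Accept-Language header at all.
  if (!language) return {}
  return { headers: { 'Accept-Language': language } }
}
```

Note that the server is free to ignore the header, so the response language should not be taken for granted.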
Then create a mapping object that maps the correct selectors based on the specified language:
const languageSelectorsMap = {
  en: {
    ranking: { selector: '.fs-table-text_1', typeOf: 'string' },
    team: { selector: '.fs-table-text_3', typeOf: 'string' },
    playerName: { selector: '.fs-table-text_4', typeOf: 'string' },
    gamesPlayed: { selector: '.fs-table-text_5', typeOf: 'number' },
    goals: { selector: '.fs-table-text_6', typeOf: 'number' }
  },
  es: {
    ranking: { selector: '.fs-table-text_1', typeOf: 'string' },
    team: { selector: '.fs-table-text_3', typeOf: 'string' },
    playerName: { selector: '.fs-table-text_4', typeOf: 'string' },
    gamesPlayed: { selector: '.fs-table-text_5', typeOf: 'number' },
    goals: { selector: '.fs-table-text_6', typeOf: 'number' }
  }
  // Add more languages here
}
Then we can use the mapping object to assign the correct selectors based on the specified language:
const modifiedSelectors = languageSelectorsMap[language] || SCORES_SELECTORS
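One thing to watch for: Accept-Language values often carry a region, for example `es-ES`, which would miss the `es` key and silently fall back. A possible lookup is sketched here, where `resolveSelectors` is a hypothetical name and the maps are passed in as arguments only to keep the example self-contained:

```javascript
// Hypothetical helper: picks the selector set for a language tag such as 'es' or 'es-ES'.
function resolveSelectors(language, selectorsMap, defaultSelectors) {
  if (typeof language !== 'string') return defaultSelectors
  // Keep only the primary subtag: 'es-ES' -> 'es'.
  const primary = language.toLowerCase().split('-')[0]
  // hasOwnProperty avoids matching inherited keys such as 'constructor'.
  const known = Object.prototype.hasOwnProperty.call(selectorsMap, primary)
  return known ? selectorsMap[primary] : defaultSelectors
}
```

It would be called as `resolveSelectors(language, languageSelectorsMap, SCORES_SELECTORS)`.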
Finally, use the modified selectors to extract the information from the web page as usual:
const scoresSelectorEntries = Object.entries(modifiedSelectors)
const topScorerList = []
$rows.each((index, el) => {
  const topScorerEntries = scoresSelectorEntries.map(([key, { selector, typeOf }]) => {
    const rawValue = $(el).find(selector).
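The snippet above is cut off. Here is a sketch of how the loop might continue, reusing the three lines from the first post. `extractTopScorers` is a hypothetical wrapper name, `cleanText` is a stand-in for the project's helper, and the shape of the returned objects is an assumption:

```javascript
// Stand-in for the project's cleanText helper (assumption: the real one may differ).
function cleanText(text) {
  return text.replace(/\s+/g, ' ').trim()
}

// Hypothetical wrapper: $ is the Cheerio root, $rows the selected table rows.
function extractTopScorers($, $rows, selectors) {
  const selectorEntries = Object.entries(selectors)
  const topScorerList = []

  $rows.each((index, el) => {
    // Build [key, value] pairs for this row, one per configured selector.
    const entries = selectorEntries.map(([key, { selector, typeOf }]) => {
      const rawValue = $(el).find(selector).text()
      const cleanedValue = cleanText(rawValue)
      const value = typeOf === 'number' ? Number(cleanedValue) : cleanedValue
      return [key, value]
    })
    topScorerList.push(Object.fromEntries(entries))
  })

  return topScorerList
}
```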