Closed by cristianMarcial 1 month ago
Note: corrected the first paragraph. It said "web scrapper" when it should say "parser"
I think this task can be split into two or three tasks.
What do you think?
Data from each column of a single row can now be extracted. Next two steps: sanitize some of the data, since the desired values are mixed together or contain unnecessary content and need cleaning; then iterate through all rows and save the extracted data to dictionaries.
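A rough sketch of what those next steps could look like, assuming the schedule page is an HTML table and BeautifulSoup is the parsing library; the column layout and field names below are placeholders, not the real catalog format:

```python
# Sketch only: assumes an HTML <table> layout and BeautifulSoup;
# the column order and field names are placeholders.
from bs4 import BeautifulSoup

html = """
<table>
  <tr><td>CIIC 4010-070</td><td> Adv. Programming </td><td>LWV 10:30am - 11:20am</td></tr>
  <tr><td>CIIC 4020-036</td><td> Data Structures  </td><td>MJ 2:00pm - 3:15pm</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
parsed_rows = []
for row in soup.find_all("tr"):
    # Extract and sanitize each column: strip surrounding whitespace
    cells = [cell.get_text(strip=True) for cell in row.find_all("td")]
    if len(cells) < 3:
        continue  # skip malformed rows
    # Some columns mix several values together; split them apart here
    course, section = cells[0].split("-", 1)
    parsed_rows.append({
        "course": course.strip(),
        "section": section.strip(),
        "title": cells[1],
        "schedule": cells[2],
    })

print(parsed_rows)
```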
@cristianramos9 thanks for the effort, but I've already implemented the parser using your previous code. The tasks can be considered complete @Ar2691 @michellebou.
@cristianMarcial it's missing team leader approval.
Also, please show the source code of the implementation as proof.
@Ar2691 Implementation of section catalog parser | Proof
The for loop isn't part of the implementation; it's only there to show that the variable "section_catalog" holds the section catalog and can be used by exporter.py.
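For reference, a verification loop along those lines might look like this (assuming section_catalog is a list of per-section dictionaries; this is illustrative only, not the actual proof code):

```python
# Illustrative only, not part of the parser: just confirms that
# section_catalog holds the parsed sections that exporter.py would consume.
for section in section_catalog:
    print(section)
```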
Objective: Create a parser that can segregate and concatenate the information on a web page obtained by a web scraper and place it neatly in a file that can be imported into our application.
Description: Using Python libraries that retrieve the source HTML of the page to be scraped from a URL, we will tokenize the extracted information about this semester's courses and write the code that segregates that information into a separate document which can be read by our application (see the sketch after this list).
Requirements:
Time Constraints: No later than October 8th.
Completion Criteria: Comply with everything listed in the requirements section.
Difficulty: 6 (1 for creating the web scraper and 5 for creating the parser).
Priority: 4
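A minimal end-to-end sketch of the pipeline described above. The URL, library choices (requests, BeautifulSoup), and JSON output format are assumptions for illustration, not the actual implementation:

```python
# Sketch of the described pipeline: fetch the page, parse it, and write a
# separate document the application can read.
import json

import requests
from bs4 import BeautifulSoup

URL = "https://example.edu/course-schedule"  # placeholder URL, not the real page

def scrape_and_export(url: str, out_path: str) -> None:
    html = requests.get(url, timeout=30).text      # web scraper step
    soup = BeautifulSoup(html, "html.parser")      # parser step
    sections = []
    for row in soup.find_all("tr"):
        cells = [c.get_text(strip=True) for c in row.find_all("td")]
        if cells:
            sections.append(cells)
    # Segregate the parsed data into a separate document the application can import
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(sections, f, indent=2)

if __name__ == "__main__":
    scrape_and_export(URL, "section_catalog.json")
```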