opencitations / oc_meta

jalc_processing + jalc_process + ra_processor modified #16

Closed by martasoricetti 1 year ago

martasoricetti commented 1 year ago

Regarding "ra_processor.py", I have added a parameter, "citing_entities" (default value None), which is the path of the folder containing the zipped file produced in the preprocessing phase. That zipped file holds the CSV files listing all the citing entities. Furthermore, in order to use the method "load_csv_column_as_set" of the class "CSVManager", I have created a method, "unzip_citing_entities", also in the class "RaProcessor", that takes the path of a directory as input and substitutes the zipped file in it with the CSVs it contained. Lastly, I have slightly modified the method "get_pages" by including the underscore among the accepted characters, because in JALC there are cases in which the pages are in the form "1_36", "1_38", etc.
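For readers who want to see the two changes in isolation, here is a minimal standalone sketch. It is not the code in this PR: the function bodies, the `PAGES` pattern, and the name `parse_pages` are assumptions made for illustration, and the real "unzip_citing_entities" and "get_pages" are methods of "RaProcessor" that may differ in their details.

```python
import re
import zipfile
from pathlib import Path


def unzip_citing_entities(directory: str) -> list:
    """Replace every .zip archive in `directory` with the CSV files it contains.

    Standalone sketch (assumption): the PR implements this as a method of RaProcessor.
    """
    extracted = []
    for archive in sorted(Path(directory).glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            for member in zf.namelist():
                # Only the CSV members are of interest here.
                if member.lower().endswith(".csv"):
                    zf.extract(member, directory)
                    extracted.append(str(Path(directory) / member))
        # Remove the archive once its CSVs are on disk.
        archive.unlink()
    return extracted


# Hypothetical pattern: a page token made of letters, digits, and underscores,
# optionally followed by a hyphen and a second token (a page range).
PAGES = re.compile(r"^([A-Za-z0-9_]+)(?:\s*-\s*([A-Za-z0-9_]+))?$")


def parse_pages(raw: str) -> str:
    """Return a normalised page string, or '' if the input does not match."""
    match = PAGES.match(raw.strip())
    if match is None:
        return ""
    start, end = match.groups()
    return f"{start}-{end}" if end else start
```

With the underscore in the accepted character set, a JALC-style value such as `parse_pages("1_36")` is kept as `"1_36"` instead of being rejected.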