Open samwaseda opened 9 months ago
I made a complete example:
from pathlib import Path
import bz2


# No guarantee that this one works - copied from ChatGPT
def read_file_inside_bz2(compressed_file_path, file_inside_bz2):
    with bz2.open(compressed_file_path, 'rt') as bz2_file:
        # Iterate through the lines in the bz2 file
        for line in bz2_file:
            # Check if the line matches file_inside_bz2
            if line.strip() == file_inside_bz2:
                # Read the content of the file_inside_bz2
                # (default "" avoids StopIteration on the last line)
                content = next(bz2_file, "").strip()
                return content
    # file_inside_bz2 was not found
    return None


class Parser:
    def __init__(self, content):
        self.content = content

    def get_energy(self):
        # parsing using self.content
        energy = ...  # placeholder for the parsed value
        return energy

    def get_forces(self):
        # parsing using self.content
        forces = ...  # placeholder for the parsed value
        return forces


# Step No. 1 in my list
def file_to_content(file_name, file_path):
    # Path.suffix includes the leading dot, so compare against ".bz2"
    if Path(file_path).suffix.lower() == ".bz2":
        return read_file_inside_bz2(file_path, file_name)
    else:
        with open(Path(file_path) / file_name, "r") as f:
            return f.read()


# Step No. 2 in my list
def content_to_functions(content):
    parser = Parser(content)
    # Return the bound methods without calling them yet
    return {
        "energy": parser.get_energy,
        "forces": parser.get_forces
    }


# Step No. 3 in my list
def functions_to_data(data_functions):
    data = {tag: func() for tag, func in data_functions.items()}
    return data
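Because step 2 returns functions rather than values, step 3 can decide which of them to call. Here is a minimal sketch of that idea. The toy file format (`energy = ...` lines), `ToyParser`, `parser_to_functions`, `functions_to_selected_data` and the `tags` argument are all made up for illustration and are not existing pyiron API.

```python
import re


class ToyParser:
    """Hypothetical parser for text with lines like 'energy = -1.5'."""

    def __init__(self, content):
        self.content = content
        self.calls = []  # records which getters actually ran

    def get_energy(self):
        self.calls.append("energy")
        match = re.search(r"^energy\s*=\s*(\S+)", self.content, re.MULTILINE)
        return float(match.group(1))

    def get_forces(self):
        self.calls.append("forces")
        match = re.search(r"^forces\s*=\s*(.+)$", self.content, re.MULTILINE)
        return [float(x) for x in match.group(1).split()]


def parser_to_functions(parser):
    # Step 2: expose one callable per quantity; nothing is parsed yet
    return {"energy": parser.get_energy, "forces": parser.get_forces}


def functions_to_selected_data(data_functions, tags=None):
    # Step 3: call only the requested functions (all of them if tags is None)
    if tags is None:
        tags = list(data_functions)
    return {tag: data_functions[tag]() for tag in tags}


content = "energy = -1.5\nforces = 0.0 0.1 -0.1\n"
parser = ToyParser(content)
data = functions_to_selected_data(parser_to_functions(parser), tags=["energy"])
```

With `tags=["energy"]`, only `get_energy` is called and the forces are never parsed, which is the point of keeping steps 2 and 3 separate.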
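Today in the pyiron Q&A session we realized the need for a parser to have multiple layers:

1. File to content: read the file, compressed or not, into plain text
2. Content to functions: the parser exposes one function per quantity
3. Functions to data: call the functions for the data the user wants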
In today's discussion, we became aware of the need for the first step, mainly because we would like to be able to load the content of a file after it has been compressed. Currently the loading takes place inside individual parsers with no standard method (`np.loadtxt`, `with open`, etc.), which may or may not work for compressed files. Instead, we could create something like `FileObject`, which can take both a plain file and a compressed file and convert it into plain text.

The last layer came up in a discussion that I had with @jan-janssen. It basically follows the idea that has already been implemented in interactive jobs: make it possible for users to choose the input and output data that they would like to use.
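As a rough sketch of what such a `FileObject` could look like, assuming the compression can be detected from the file suffix: the class layout, the `read_text` method and the suffix table below are assumptions for illustration, not an existing pyiron interface. Only the standard-library `bz2.open`, `gzip.open` and `lzma.open` calls are real.

```python
import bz2
import gzip
import lzma
import tempfile
from pathlib import Path

# Map file suffixes to the stdlib opener that handles them
_OPENERS = {".bz2": bz2.open, ".gz": gzip.open, ".xz": lzma.open}


class FileObject:
    """Hypothetical wrapper: returns plain text for plain or compressed files."""

    def __init__(self, path):
        self.path = Path(path)

    def read_text(self, encoding="utf-8"):
        # Fall back to the built-in open for uncompressed files
        opener = _OPENERS.get(self.path.suffix.lower(), open)
        with opener(self.path, "rt", encoding=encoding) as f:
            return f.read()


# Usage: the same call works before and after compression
with tempfile.TemporaryDirectory() as tmp:
    plain = Path(tmp) / "output.txt"
    plain.write_bytes(b"energy = -1.5\n")
    packed = Path(tmp) / "output.txt.bz2"
    packed.write_bytes(bz2.compress(plain.read_bytes()))
    plain_text = FileObject(plain).read_text()
    packed_text = FileObject(packed).read_text()
```

A parser would then only ever receive the returned string and would not need to know whether the file was compressed. Note that this covers single compressed files only. Archives holding several files (e.g. `.tar.bz2`) would need something like `tarfile` on top.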