Closed galvangoh closed 3 days ago
Yes, there is a way to output the parsing result in JSON using the get_json_result() method when using LlamaParse as a file_extractor together with AzStorageBlobReader to read PDFs in blob storage. The get_json_result() method is defined in the DashScopeParse class, which is part of the LlamaIndex framework.
Here is the relevant code snippet for the get_json_result() method:
    def get_json_result(
        self, file_path: Union[List[str], str], extra_info: Optional[dict] = None
    ) -> List[dict]:
        extra_info = {"parse_fmt_type": ResultType.DASHSCOPE_DOCMIND.value}
        """Parse the input path."""
        try:
            return asyncio.run(self.aget_json(file_path, extra_info))
        except RuntimeError as e:
            if nest_asyncio_err in str(e):
                raise RuntimeError(nest_asyncio_msg)
            else:
                raise
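The try/except in the snippet above guards the asyncio.run() call, which raises a RuntimeError when an event loop is already running (for example, inside a Jupyter notebook). The following is a minimal standard-library sketch of that failure mode. The names fetch_value, sync_wrapper, and caller are hypothetical and are not part of LlamaIndex; that nest_asyncio_err matches this particular message is an assumption.

```python
import asyncio


async def fetch_value() -> int:
    # Stand-in for an async call such as aget_json (hypothetical).
    return 42


def sync_wrapper() -> int:
    # Synchronous wrapper in the same style as get_json_result().
    coro = fetch_value()
    try:
        return asyncio.run(coro)
    except RuntimeError:
        # asyncio.run() did not consume the coroutine; close it to
        # avoid a "never awaited" warning, then re-raise.
        coro.close()
        raise


async def caller() -> str:
    # Calling the sync wrapper while a loop is running triggers the error.
    try:
        sync_wrapper()
    except RuntimeError as e:
        return str(e)
    return "no error"


if __name__ == "__main__":
    print(sync_wrapper())          # works: no loop is running yet
    print(asyncio.run(caller()))   # reports the running-event-loop error
```

Outside a running loop the wrapper returns normally. Inside one, the error message mentions the running event loop, which is presumably what the nest_asyncio hint in the snippet is reacting to.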
This method can be used to parse the input file path and return the result in JSON format [1].
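Because get_json_result() takes a file path rather than a reader, one possible workaround is to write the blob's bytes to a temporary file and pass that path to the parser directly. The sketch below assumes the PDF bytes have already been downloaded from blob storage (that step is not shown) and that the parser object exposes get_json_result(file_path) with the signature shown above. The helper parse_blob_to_json and the StubParser class are hypothetical and exist only for illustration.

```python
import os
import tempfile
from typing import List


def parse_blob_to_json(blob_bytes: bytes, parser, suffix: str = ".pdf") -> List[dict]:
    """Write blob bytes to a temp file and hand its path to the parser.

    `parser` is any object with a get_json_result(file_path) method
    (an assumption based on the signature quoted above).
    """
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        tmp.write(blob_bytes)
        path = tmp.name
    try:
        return parser.get_json_result(path)
    finally:
        # Always clean up the temporary file, even if parsing fails.
        os.remove(path)


class StubParser:
    """Hypothetical stand-in for a real parser, for local testing only."""

    def get_json_result(self, file_path: str) -> List[dict]:
        with open(file_path, "rb") as f:
            return [{"file_path": file_path, "size": len(f.read())}]


if __name__ == "__main__":
    result = parse_blob_to_json(b"%PDF-1.4 example", StubParser())
    print(result[0]["size"])
```

In real use the stub would be replaced by an actual parser instance. Whether this fits alongside AzStorageBlobReader's file_extractor flow depends on how the blobs are fetched, so treat it as a starting point rather than a confirmed integration.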
Question
I am currently using LlamaParse as a file_extractor together with AzStorageBlobReader to read PDFs in blob storage and parse the documents at the same time. Currently, this integration only allows .load_data(), which outputs the parsing result as either text or markdown. Is there a way to also output the parsing result as JSON, the same as get_json_result() as mentioned in this issue from LlamaParse? Thank you.