Open shrutimantri opened 7 months ago
When an XML file with `items` is read, the records should be read in ion format without `items` or `item` in the ion file. Example: the following XML file:
<?xml version='1.0' encoding='UTF-8'?>
<items>
  <item> <job_title>BI Data Analyst</job_title> <avg_salary>836644.8</avg_salary> </item>
  <item> <job_title>ML Engineer</job_title> <avg_salary>679247.63</avg_salary> </item>
  <item> <job_title>Data Science Manager</job_title> <avg_salary>391371.17</avg_salary> </item>
  <item> <job_title>Business Data Analyst</job_title> <avg_salary>286000.0</avg_salary> </item>
  <item> <job_title>Data Scientist</job_title> <avg_salary>257422.32</avg_salary> </item>
  <item> <job_title>Computer Vision Engineer</job_title> <avg_salary>220583.33</avg_salary> </item>
  <item> <job_title>AI Scientist</job_title> <avg_salary>193666.67</avg_salary> </item>
  <item> <job_title>Applied Scientist</job_title> <avg_salary>190614.29</avg_salary> </item>
  <item> <job_title>Machine Learning Engineer</job_title> <avg_salary>175270.55</avg_salary> </item>
  <item> <job_title>Research Scientist</job_title> <avg_salary>161292.29</avg_salary> </item>
  <item> <job_title>Data Architect</job_title> <avg_salary>160283.26</avg_salary> </item>
  <item> <job_title>Data Engineer</job_title> <avg_salary>157510.03</avg_salary> </item>
  <item> <job_title>Machine Learning Scientist</job_title> <avg_salary>154638.64</avg_salary> </item>
  <item> <job_title>Research Engineer</job_title> <avg_salary>146618.11</avg_salary> </item>
  <item> <job_title>Analytics Engineer</job_title> <avg_salary>142703.15</avg_salary> </item>
  <item> <job_title>Data Science Consultant</job_title> <avg_salary>141937.5</avg_salary> </item>
  <item> <job_title>Data Analytics Manager</job_title> <avg_salary>141463.33</avg_salary> </item>
  <item> <job_title>Machine Learning Infrastructure Engineer</job_title> <avg_salary>141076.36</avg_salary> </item>
  <item> <job_title>BI Developer</job_title> <avg_salary>129846.15</avg_salary> </item>
  <item> <job_title>Data Specialist</job_title> <avg_salary>122083.33</avg_salary> </item>
  <item> <job_title>Data Manager</job_title> <avg_salary>120203.05</avg_salary> </item>
  <item> <job_title>Data Analyst</job_title> <avg_salary>116348.29</avg_salary> </item>
</items>
should be read by the XML reader as records without the `items` or `item` wrapper. Instead, the same XML file is currently read by the XML reader as:
{"item":[{"avg_salary":836644.8,"job_title":"BI Data Analyst"},{"avg_salary":679247.63,"job_title":"ML Engineer"},{"avg_salary":391371.17,"job_title":"Data Science Manager"},{"avg_salary":286000,"job_title":"Business Data Analyst"},{"avg_salary":257422.32,"job_title":"Data Scientist"},{"avg_salary":220583.33,"job_title":"Computer Vision Engineer"},{"avg_salary":193666.67,"job_title":"AI Scientist"},{"avg_salary":190614.29,"job_title":"Applied Scientist"},{"avg_salary":175270.55,"job_title":"Machine Learning Engineer"},{"avg_salary":161292.29,"job_title":"Research Scientist"},{"avg_salary":160283.26,"job_title":"Data Architect"},{"avg_salary":157510.03,"job_title":"Data Engineer"},{"avg_salary":154638.64,"job_title":"Machine Learning Scientist"},{"avg_salary":146618.11,"job_title":"Research Engineer"},{"avg_salary":142703.15,"job_title":"Analytics Engineer"},{"avg_salary":141937.5,"job_title":"Data Science Consultant"},{"avg_salary":141463.33,"job_title":"Data Analytics Manager"},{"avg_salary":141076.36,"job_title":"Machine Learning Infrastructure Engineer"},{"avg_salary":129846.15,"job_title":"BI Developer"},{"avg_salary":122083.33,"job_title":"Data Specialist"},{"avg_salary":120203.05,"job_title":"Data Manager"},{"avg_salary":116348.29,"job_title":"Data Analyst"}]}
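The wrapped shape above can be flattened after the fact. The following is a minimal Python sketch of that post-processing step, not part of the plugin: `unwrap_records` is a hypothetical helper, and the two sample rows are copied from the output above. It assumes the wrapper key is always `item`.

```python
import json


def unwrap_records(wrapped, key="item"):
    """Return the row records nested under `key` (hypothetical helper)."""
    value = wrapped.get(key, [])
    # Assumption: a file with a single <item> might decode to a dict, not a list.
    return value if isinstance(value, list) else [value]


# Two rows copied from the reader output shown above.
actual = json.loads(
    '{"item":[{"avg_salary":836644.8,"job_title":"BI Data Analyst"},'
    '{"avg_salary":679247.63,"job_title":"ML Engineer"}]}'
)

records = unwrap_records(actual)
for record in records:
    # One record per line, with no `item` wrapper.
    print(json.dumps(record))
```

This only works around the wrapped output downstream; the request in this issue is for the reader itself to emit the unwrapped records.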
Run the following flow:
id: xml-writer
namespace: company.team
description: Analyse data salaries.

tasks:
  - id: download_csv
    type: io.kestra.plugin.fs.http.Download
    description: Data Job salaries from 2020 to 2023 (source ai-jobs.net)
    uri: https://gist.githubusercontent.com/Ben8t/f182c57f4f71f350a54c65501d30687e/raw/940654a8ef6010560a44ad4ff1d7b24c708ebad4/salary-data.csv

  - id: average_salary_by_position
    type: io.kestra.plugin.jdbc.duckdb.Query
    inputFiles:
      data.csv: "{{ outputs.download_csv.uri }}"
    sql: |
      SELECT
        job_title,
        ROUND(AVG(salary),2) AS avg_salary
      FROM read_csv_auto('{{workingDir}}/data.csv', header=True)
      GROUP BY job_title
      HAVING COUNT(job_title) > 10
      ORDER BY avg_salary DESC;
    store: true

  - id: export_result
    type: "io.kestra.plugin.serdes.xml.XmlWriter"
    from: "{{ outputs.average_salary_by_position.uri }}"

  - id: xml_reader
    type: io.kestra.plugin.serdes.xml.XmlReader
    from: "{{ outputs.export_result.uri }}"
Then inspect the output of the `xml_reader` task.
Expected Behavior
When an XML file with `items` is read, the records should be read in ion format without `items` or `item` in the ion file, as described for the example XML file above.
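To make the expected shape concrete, here is a minimal Python sketch that reads a shortened copy of the example XML and produces one record per `<item>`. It illustrates the requested result only and is not how `XmlReader` is implemented. `read_items` and `to_value` are hypothetical helpers, and converting numeric text to floats is an assumption about the desired typing.

```python
import json
import xml.etree.ElementTree as ET

# Shortened copy of the example XML file (first two items only).
SAMPLE_XML = """<?xml version='1.0' encoding='UTF-8'?>
<items>
  <item><job_title>BI Data Analyst</job_title><avg_salary>836644.8</avg_salary></item>
  <item><job_title>ML Engineer</job_title><avg_salary>679247.63</avg_salary></item>
</items>"""


def to_value(text):
    """Convert numeric text to float; leave other text unchanged (assumption)."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return text


def read_items(xml_text):
    """Return one dict per <item>, dropping the <items>/<item> wrappers."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return [
        {child.tag: to_value(child.text) for child in item}
        for item in root.findall("item")
    ]


records = read_items(SAMPLE_XML)
for record in records:
    # One record per line; neither `items` nor `item` appears in the output.
    print(json.dumps(record))
```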
Actual Behaviour
The example XML file above is read by the XML reader as a single record that nests every entry under an `item` key.
Steps To Reproduce
Run the example flow above and inspect the output of the `xml_reader` task.
Environment Information
Example flow