Closed andrewtavis closed 1 month ago
We should also discuss what the default export directory name is. I think there's value in branding it, as it'll also make it easier for the lay user to find. Maybe scribe_data_export? If so, we should rename language_data_export :)
Yea, we can rename it to scribe_data_export.
Hey @mhmohona! FYI, I'm realizing that it'd make sense to have file-type-based names for the export directories, as ultimately the JSON directory will still need to exist while the SQLite directories are being created. It also makes returning them and distinguishing between them a bit easier. As seen in the project root, we now have scribe_data_json_export and scribe_data_sqlite_export. Let me know if you think this makes sense! We can also change it back later.
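As a rough illustration of the file-type-based naming, a minimal Python sketch could map each export type to its directory. The two directory names come from the comment above; the `EXPORT_DIRS` mapping and the `export_dir_for` helper are hypothetical and are not taken from Scribe-Data itself.

```python
from pathlib import Path

# Directory names as mentioned in this thread; the mapping itself is hypothetical.
EXPORT_DIRS = {
    "json": "scribe_data_json_export",
    "sqlite": "scribe_data_sqlite_export",
}


def export_dir_for(file_type: str, root: Path = Path(".")) -> Path:
    """Return the export directory for a file type (hypothetical helper)."""
    try:
        return root / EXPORT_DIRS[file_type.lower()]
    except KeyError:
        # Fail loudly on file types that have no export directory.
        raise ValueError(f"Unsupported export type: {file_type!r}") from None
```

Keeping the names in one place like this would also make it cheap to change them back later.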
So I have worked on this issue, and this is the output it's now giving:
Is it okay? How can I improve it?
We can help with the output for this, @mhmohona, but as said on Matrix, let's include the length of the file :) Check the line in the utils for how the original output of update_data.py looks as well.
For this one, @mhmohona, we need to convert Scribe-Data's CLI query command over to using update_data.py. Maybe we can do a call on this at some point to plan this out a bit better? I think that we have the basics of what's needed here, but remember that the goal is that we're bringing down new data, not moving the data from scribe_data_json_export, as this directory will eventually be removed.
Thinking about this further, @mhmohona, as of now scribe_data/wikidata/update_data.py is just a script that's run via the command line. What likely needs to happen is that we put the code for that file into a function that we can then import into the scribe_data/cli/query.py file :) We won't be running update_data.py directly anymore, so this should work really well.
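The refactor suggested here could look roughly like the sketch below. It is only an illustration: the `update_data` signature, the `query_data` handler, the per-language file layout, and the injected `fetch` callable (standing in for the actual Wikidata query) are all assumptions, not the real Scribe-Data code.

```python
import json
from pathlib import Path
from typing import Callable, Iterable


def update_data(
    languages: Iterable[str],
    word_types: Iterable[str],
    output_dir: Path,
    fetch: Callable[[str, str], list],
) -> list[Path]:
    """Hypothetical function form of the update_data.py script body.

    `fetch` stands in for whatever runs the query for one language and
    word type; it is injected so that this sketch runs offline.
    """
    written = []
    for language in languages:
        for word_type in word_types:
            results = fetch(language, word_type)
            target = Path(output_dir) / language / f"{word_type}.json"
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(
                json.dumps(results, ensure_ascii=False, indent=2),
                encoding="utf-8",
            )
            written.append(target)
    return written


# In scribe_data/cli/query.py the handler would import update_data
# (import path assumed) instead of running the script directly.
def query_data(languages, word_types, output_dir, fetch):
    """Hypothetical CLI handler that delegates to update_data()."""
    paths = update_data(languages, word_types, Path(output_dir), fetch)
    for path in paths:
        print(f"Wrote {path}")
    return paths
```

With the logic in a function, the CLI and any tests can call it with their own output directory rather than relying on the script's hard-coded paths.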
@andrewtavis, does this satisfy the requirement?
Looking really great, @mhmohona! I think that we're ready for a PR :)
I have pushed my changes on #163, as I forgot to switch branches before working.
Closed by #163. Thanks for all the work here, @mhmohona!
Terms
Description
This issue would add the --output-dir (-od) and --overwrite (-o) functionality to the Scribe-Data CLI. This will allow the user to specify a directory where the results of the --query command will be written. Note that I renamed this to --output-dir from the discussed --output-file, since if there's more than one output file, the string provided will need to be a directory. To avoid checks for whether we're returning a file or a directory, and the issues that would cause for end users, let's always return a directory. Things this argument will do:
- [...] .json, etc)
- --overwrite should be checked:
  - If false, check if each file that already exists should be rewritten, and skip queries if the user says no via input()
  - If true, then overwrite all files
Contribution
@mhmohona will be working on this as a part of GSoC 2024. Please write in here if you would, and let us know if you need some support!
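For reference, the --output-dir and --overwrite behaviour in the description could be sketched in Python as below. This is a minimal sketch under stated assumptions, not the merged implementation: `build_parser`, `write_results`, the default directory value, the `.json` file naming, and the `ask` parameter (which defaults to `input()` and exists only so the prompt can be stubbed) are all hypothetical.

```python
import argparse
import json
from pathlib import Path
from typing import Callable, Dict, List


def build_parser() -> argparse.ArgumentParser:
    """Parser with only the two flags discussed in this issue."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "-od", "--output-dir",
        default="scribe_data_json_export",  # assumed default, not confirmed
        help="Directory that query results are written to.",
    )
    parser.add_argument(
        "-o", "--overwrite",
        action="store_true",
        help="Overwrite existing files without asking.",
    )
    return parser


def write_results(
    queries: Dict[str, Callable[[], object]],
    output_dir: str,
    overwrite: bool = False,
    ask: Callable[[str], str] = input,
) -> List[Path]:
    """Run each query and write its result as JSON, honouring --overwrite.

    `queries` maps a file stem to a zero-argument callable, so a query is
    only run when its file is actually going to be written.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)  # assumption: create if missing
    written = []
    for name, run_query in queries.items():
        target = out / f"{name}.json"
        if target.exists() and not overwrite:
            answer = ask(f"{target} already exists. Overwrite? (y/n): ")
            if answer.strip().lower() not in ("y", "yes"):
                continue  # user said no: skip the query entirely
        data = run_query()
        target.write_text(
            json.dumps(data, ensure_ascii=False, indent=2), encoding="utf-8"
        )
        written.append(target)
    return written
```

Passing callables rather than finished results is one way to honour the "skip queries if the user says no" requirement, since declined files never trigger a query at all.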