protegeproject / protege

Protege Desktop
http://protege.stanford.edu

Converting files to different formats via the command-line using Protege #1098

Closed Superraptor closed 1 year ago

Superraptor commented 1 year ago

This isn't a bug but rather more of a general query or feature request.

Relevant Software

Background

Essentially, I've been handed a really-not-so-great Turtle file that I don't have much control over. It needs to be converted programmatically to OWL for a subsequent pipeline. It can be read into rdflib, but not owlready2, due to numerous wonky statements. I've tried writing a script to fix all of these issues programmatically, but the script is getting a little aggressive in length (600+ lines).

With that in mind, Protege has no problem opening the file and saving it. That spawned the idea of "hey, why don't I just use Protege for this?" So I began trying to get Protege to work in the script I'm writing.

I've managed to get Protege to open the file from a Python script using os.system, combining approaches discussed here and here; i.e.

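import os
# Placeholder paths; every backslash is escaped so Python does not read any of them as an escape sequence.
protege_run_command = 'PATH_TO_CMD\\cmd.exe /c "PATH_TO_PROTEGE\\run.bat PATH_TO_FILE\\file_name.ttl"'
os.system(protege_run_command)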

This opens a new Protege window with file_name.ttl loaded.

Request

It would be a life-saver to be able to save file_name.ttl as file_name.owl from the Python script I have. The command-line log output that Protege prints when saving looks like this:

Copying ontology from temp file (TEMP_FILE_PATH\ontologytempfile) to actual destination (DESTINATION_FILE_PATH\file_name.owl)
Removing temp file: TEMP_FILE_PATH\ontologytempfile
Saved ontology OntologyID(XXX) to file:DESTINATION_FILE_PATH\file_name.owl in RDF/XML Syntax format

While the copy and remove commands are simple enough to implement in a script, I can't wrap my head around how to perform the save command, and even if I did figure it out, I'm not sure whether the format would default to something indecipherable. I'm also not sure how temporary files are allocated, so if there are multiple ontologytempfiles with similar names, it may become difficult to determine which one is "correct".

Open Questions

With that being said, I just have a couple of questions:

  1. Does a command-line save or convert feature exist in any way, even if that way is a bit hacky? For example, if you have to call a particular .jar file with certain arguments, that's totally workable.
  2. If the feature does not exist in any way, would there be interest in creating such a feature?

Thank you so much for your time!

matthewhorridge commented 1 year ago

Doing this with Protege sounds a bit like torture. I'm wondering if ROBOT, which uses the same underlying APIs as Protege, might be better for your needs.

Superraptor commented 1 year ago

I can't believe I didn't know this existed! So many rad features to work with! Thank you so much! I'll include my quick Python implementation for running things here, just in case anyone comes across things in the future:

It still required a little manual finagling for my file since owlready2 is stricter than robot with formatting (which is good, but my file is just of really questionable quality; oh well, it's out of my hands). But it's finally done!

jamesaoverton commented 1 year ago

I'm glad ROBOT was useful @Superraptor. Just a quick note for anyone reading this: your example should read robot convert --input <INPUT_FILE> --format <FORMAT> --output <OUTPUT_FILE>. See the docs at http://robot.obolibrary.org/convert.
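For anyone scripting this from Python, the command form above could be wrapped roughly as follows. This is a minimal sketch, not the snippet originally posted in this thread: the function names are hypothetical, it assumes a robot executable is on your PATH, and the "owl" format value in the usage comment should be checked against the ROBOT convert docs.

```python
import shutil
import subprocess


def build_robot_convert_command(input_file, output_file, fmt, robot="robot"):
    """Build the argument list for: robot convert --input ... --format ... --output ..."""
    return [
        robot, "convert",
        "--input", str(input_file),
        "--format", fmt,
        "--output", str(output_file),
    ]


def convert_with_robot(input_file, output_file, fmt, robot="robot"):
    """Run the conversion; raises CalledProcessError if ROBOT exits with an error."""
    # Resolve the launcher first (on Windows this also finds robot.bat via PATHEXT).
    executable = shutil.which(robot)
    if executable is None:
        raise FileNotFoundError(f"'{robot}' was not found on PATH")
    command = build_robot_convert_command(input_file, output_file, fmt, executable)
    subprocess.run(command, check=True)


# Example call (assumes ROBOT is installed and the input file exists):
# convert_with_robot("file_name.ttl", "file_name.owl", "owl")
```

Passing the arguments as a list avoids the quoting problems that come with building one long command string for os.system.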

Superraptor commented 1 year ago

Yes, you're right, thank you for the correction!