tetherless-world / mowgli-etl

DARPA Machine Common Sense (MCS) Multi-modal Open World Grounded Learning and Inference (MOWGLI) Extract-Transform-Load sub-project
MIT License

Webchild #7

Status: Closed (rohatd closed this 4 years ago)

rohatd commented 4 years ago

I think this should properly parse the data from the four .txt files in ../WebChildData.

gordom6 commented 4 years ago

Put the data in data/ off the root of the repository, in the structure that already exists. If a file is over 50 MB, don't check it in; just .gitignore it and assume a copy will be on your system. Subset any larger files and check the subsets in so they can be used in tests.
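
A minimal sketch of the subsetting step, assuming a line-oriented text file and that the first N lines make an acceptable test fixture. The function name `write_subset` and the 1000-line default are hypothetical, not something that exists in the repository:

```python
from itertools import islice
from pathlib import Path


def write_subset(source: Path, destination: Path, max_lines: int = 1000) -> int:
    """Copy at most max_lines lines from source to destination.

    Returns the number of lines written. Files are handled as bytes so the
    copy does not depend on the encoding of the original data.
    """
    # Create the target directory (e.g. a test-data folder) if it is missing.
    destination.parent.mkdir(parents=True, exist_ok=True)
    written = 0
    with source.open("rb") as src, destination.open("wb") as dst:
        # islice stops after max_lines without reading the rest of the file.
        for line in islice(src, max_lines):
            dst.write(line)
            written += 1
    return written
```

The full file would then be listed in .gitignore, and only the file produced by `write_subset` would be checked in for tests.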

stouffers commented 4 years ago

@rohatd I am unable to run the webchild pipeline. There is an indentation issue in the pipeline file and then some import issues as well.

To run the pipeline, activate the venv and then run python -m mowgli.cli --pipeline-module webchild

rohatd commented 4 years ago

I made the fixes needed for this command to run properly:

python -m mowgli.cli --pipeline-module webchild --memberof-csv-file-path '{path_to_mowgli}\mowgli\data\webchild\webchild_partof_memberof.txt' --physical-csv-file-path '{path_to_mowgli}\mowgli\data\webchild\webchild_partof_physical.txt' --substanceof-csv-file-path '{path_to_mowgli}\mowgli\data\webchild\webchild_partof_substanceof.txt' --wordnet-csv-file-path '{path_to_mowgli}\mowgli\data\webchild\WordNetWrapper.txt'

rohatd commented 4 years ago

All it should need to run now is python -m mowgli.cli --pipeline-module webchild, since I set the default paths to point to the data folder.
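
For illustration, default paths of this kind could be wired up roughly as follows. This is a sketch using the standard `argparse` module, not the actual mowgli CLI code: the `DATA_DIR` location, the `DEFAULT_FILES` mapping, and `build_parser` are assumptions made for the example, while the flag names and file names are the ones from the command in this thread.

```python
import argparse
from pathlib import Path

# Assumption: the WebChild files live in data/webchild relative to the
# directory the command is run from.
DATA_DIR = Path("data") / "webchild"

# Flag prefix -> file name, taken from the command line quoted in this thread.
DEFAULT_FILES = {
    "memberof": "webchild_partof_memberof.txt",
    "physical": "webchild_partof_physical.txt",
    "substanceof": "webchild_partof_substanceof.txt",
    "wordnet": "WordNetWrapper.txt",
}


def build_parser(data_dir: Path = DATA_DIR) -> argparse.ArgumentParser:
    """Build a parser whose path options default to files under data_dir."""
    parser = argparse.ArgumentParser(prog="webchild-sketch")
    for name, file_name in DEFAULT_FILES.items():
        parser.add_argument(
            f"--{name}-csv-file-path",
            type=Path,
            default=data_dir / file_name,  # used when the flag is omitted
        )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    for key, value in sorted(vars(args).items()):
        print(f"{key} = {value}")
```

With defaults like these, each path flag stays available as an override, and omitting all of them falls back to the files in the data folder.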