Closed SindhuBairavi closed 7 years ago
Hi!
The current goals of IEPY do not include automatic re-preprocessing of modified documents.
You could add it by defining a Django post-save handler for IEDocuments and, inside it, running the preprocess steps that need to be redone. If you have the time and the inclination, go ahead and we will try to guide you.
Some pointers:
    pipeline = PreProcessPipeline([
        ProcessStepA(override=True),
        ProcessStepB(override=True),
        ....
        ProcessStepN(),
    ], document)
    pipeline.process_everything()
Hi, Would this post-save also help in updating the relative tables involved? I can try adding it, but I need help in understanding the current scripts.
Sindhu:
Yes, with a post-save handler like that we/you should be able to update whatever needs updating.
That said, I'm not sure I understand what you mean.
Going back to your original post, you said "Even if i delete the record and reload the modified content as a new record, the relative tables don't get updated" and I'm wondering what you mean by that. Maybe I misread you.
Can you explain what those "relative tables" are? What information do they store, and what changes would you want to apply to them?
Javier Mansilla - Technical Leader www.machinalis.com
Hi, what I meant is: if I update the content of an existing record, the other tables, such as entities, segments and others, do not get updated, so they still hold data derived from the original content. If I instead delete the record and reload it, I need to manually check the related tables and update or delete their rows as well. Do let me know if I need to elaborate more.
Ok. As explained, no, IEPY does not re-run anything when the content changes.
But removing a document, re-inserting it, and then re-running preprocess should do the trick. Segments, EntitiesOccurrences, and the other related objects are removed when a document is removed.
How were you deleting Documents?
Oh. Can you tell me how to remove a document? I've been modifying the database directly with a DELETE statement, which I'm assuming is incorrect.
Can I remove a document from a script instead of the UI?
Yes, you can remove Documents from the UI.
If you start the web server with
python bin/manage.py runserver
you can then access and remove your documents at http://127.0.0.1:8000/admin/corpus/iedocument/ (that is your local web server address).
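For the script route asked about above, a minimal sketch follows; it is an illustration under assumptions, not an official IEPY utility. It deletes through the Django ORM rather than with raw SQL, because Django handles the removal of related objects itself on an ORM delete, and a hand-written DELETE statement can leave those rows behind. The helper name `delete_documents` is hypothetical, and `human_identifier` is assumed to be the lookup field on IEDocument.

```python
def delete_documents(manager, identifiers):
    """Delete documents through the ORM so related objects are removed too.

    manager is expected to behave like IEDocument.objects; the
    human_identifier lookup field is an assumption about the model.
    Returns how many documents matched before deletion.
    """
    queryset = manager.filter(human_identifier__in=list(identifiers))
    matched = queryset.count()
    queryset.delete()  # ORM delete: Django also removes dependent objects
    return matched


# Usage from "python bin/manage.py shell" (not executed here; needs IEPY):
#   from iepy.data.models import IEDocument   # assumed import path
#   delete_documents(IEDocument.objects, ["my-document-id"])
```

After that, the modified content can be loaded again as a new document and preprocess re-run as usual.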
When a document's content is updated, only the text field in the db changes. If I run preprocess again, it does not redo the preprocessing, so the text is modified but the preprocessed content and the tables derived from it are not. Even if i delete the record and reload the modified content as a new record, the relative tables don't get updated. How should updated content be handled?