Closed tjohnson314 closed 8 years ago
The new regex looks like it's working.
Now we need to update asv_plain.txt and spanish_plain.txt to reflect the new standard. Can you add that data and push it onto this PR?
Oops, this isn't what I wanted. I updated the files inside the Vagrant box by running the new version of unbound2plain.py, exited, and committed them. But somehow once I'm outside the Vagrant box, the old version of the files is still there.
I'm confused now. I thought the corpora folders were synced? Maybe that doesn't guarantee that the subfolders under them are synced?
When you run unbound2plain.py from the tests folder, it outputs a file called asv_plain.txt.out. Move that to asv_plain.txt. Do the same for spanish. Those will serve as our test standards.
You can do this from outside the Mgiza machine by starting at the root of the git repo and running sh ./mgiza/tests/unbound2plain.sh
-- Briggs
On Jan 20, 2016, at 9:35 PM, tjohnson314 notifications@github.com wrote:
Oops, this isn't what I wanted. I updated the files inside the Vagrant box by running the new version of unbound2plain.py, exited, and committed them. But somehow once I'm outside the Vagrant box, the old version of the files is still there.
I'm confused now. I thought the corpora folders were synced? Maybe that doesn't guarantee that the subfolders under them are synced?
— Reply to this email directly or view it on GitHub.
Now that I think about it, the output from my_tokenizer.py
depends on correct input from unbound2plain.py
, so you'll need to update the following files in /tests/data
:
asv_plain.txt spanish_plain.txt asv_tokenized.txt spanish_tokenized.txt
The log might be messier than it needed to be, but everything should work now, haha.
Okay, @tjohnson314, I hate to do this to you, but here goes: We merged that reeeeally big re-org into development last night. Can you please pull and merge that change into your PR, then push it back? That should get us fixed for good on the testing front.
We should ensure that @tjohnson314 has a commit history though.
Agreed. His works as a hotfix on master and will show up in the commit history, but might not show up in the git blame
once we push dev onto master.
Wow, I didn't know git blame was a thing. That's pretty cool. So what should I do?
I don't think you really need to do anything. You submitted a critically important hotfix to the master branch, which we really needed.
It all depends on how you want to be credited. Your commit will remain in the git history, but the blame will most likely shift to me for that line since I submitted that change in dev after yours on master. It's just a single line, though, and you have lots of other code contributed. Not to mention that you're driving a lot of our algorithm design. I wouldn't sweat credit over a single line in a regex :)
I really hate to do this and it's my mistake, but I fixed this in another PR thinking that this one was targeted at the master branch. Merging it in would require a lot of de-conflicting to just bring us back to where we already are.
I messed up, sorry.
If you're worried about me getting credit, it doesn't really matter to me. That's not why I'm working on this. :-)
Fixes #25