Closed reckart closed 7 years ago
There have been quite some people working on DKPro Core under Windows and nobody ever
complained about this. What SVN client do you use?
Original issue reported on code.google.com by richard.eckart
on 2014-07-02 04:47:15
1.8.8
Original issue reported on code.google.com by pedrobssantos
on 2014-07-02 07:43:05
Which client? TortoiseSVN, Eclipse (pure java or JavaHL?), svn command line, ...?
Original issue reported on code.google.com by richard.eckart
on 2014-07-02 07:44:23
I just tried to create a file called prn.java and it was not possible.
http://en.wikipedia.org/wiki/Filename#Reserved_characters_and_words
Original issue reported on code.google.com by pedrobssantos
on 2014-07-02 07:45:15
Command line. But I really don't think that is the case. Anyway, I can post on the developers
list asking if someone also have the same the problem if you want to.
Original issue reported on code.google.com by pedrobssantos
on 2014-07-02 07:59:20
And the same problem occurs with tortoise, subclipse, and etc.
Original issue reported on code.google.com by pedrobssantos
on 2014-07-02 08:00:14
I can confirm this error (just decided to also use Windows for development from now
on - due to a separate reason)
Checkout error; svn: Cannot create new file 'C:\Users\Eckle-Kohler\workspace\maven.1404288938825\de.tudarmstadt.ukp.dkpro.core.api.syntax-asl\src\main\java\de\tudarmstadt\ukp\dkpro\core\api\syntax\type\constituent\.svn\tmp\text-base\PRN.java.svn-base'
(Windows 7, checkout in eclipse via Subclipse integration)
Original issue reported on code.google.com by eckle.kohler
on 2014-07-02 08:22:50
If we rename this class, there is a risk of breakage because existing CASes may no longer
be readable, e.g. in CSniper. We can add this to the list of things to change when
we restructure the type system (actually I think it covered by issue 79), but before
that, I'd prefer not to touch it. Until then, you can check out everything on Windows
except the syntax module - which you would have to draw in as a Maven dependency. If
the PRN.class is in a JAR, it should not be a problem.
Original issue reported on code.google.com by richard.eckart
on 2014-07-02 08:28:25
(No text was entered with this change)
Original issue reported on code.google.com by richard.eckart
on 2014-08-06 20:19:04
this issue appears not to exist anymore
just checked out trunk of both DKPro Core asl and gpl under Windows 7 - everything
checks out and builds fine
performed checkout in eclipse (Luna Release (4.4.0)) via Subclipse integration
Original issue reported on code.google.com by eckle.kohler
on 2014-11-19 09:23:29
issue persists - must have been a fata morgana when I tried it last time
Original issue reported on code.google.com by eckle.kohler
on 2014-11-24 19:05:32
Would it be possible to duplicate the PRN
class (giving the duplicate a different name such as, say, PRNx
), update all references to PRN
to PRNx
, mark the original as @deprecated
, and then delete the original PRN
two releases from now? This will give maintainers of other programs advance warning of the new class name and time to update their code.
A potential problem with changing the name of the class is that there are an awful lot of string references to "PRN"
in the code. If any of these strings are dynamically mapped to class names, we'll need to handle "PRNx"
-> PRN
as a special case.
It would be nice to solve this problem as part of migrating to some universal phrase tag set (consituents) #268.
@pkluegl :)
This is not only PRN I think, but also AUX?
https://en.wikipedia.org/wiki/Filename
In addition, in Windows and DOS utilities, some words are also reserved and cannot be used as filenames.[17] For example, DOS device files:[19]
CON, PRN, AUX, CLOCK$, NUL COM1, COM2, COM3, COM4 LPT1, LPT2, LPT3, LPT4 (LPT4 only in some versions of DR-DOS) LST (only in 86-DOS and DOS 1.xx) KEYBD$, SCREEN$ (only in multitasking MS-DOS 4.0) $IDLE$ (only in Concurrent DOS 386, Multiuser DOS and DR DOS 5.0 and higher) CONFIG$ (only in MS-DOS 7.0-8.0)
Grmpf.... Windows....
AUX is a UD name... http://universaldependencies.org/u/pos/AUX_.html
If we change that, we loose the 1-to-1 mapping with the UD that we wanted to have as part of https://github.com/dkpro/dkpro-core/issues/76 ...
One option would be to prefix all the POS types with e.g. UD_
...
Any suggestions/comments? @logological @pkluegl @zesch @Horsmann @judithek @carschno @dkpro/core-dev
I think it makes sense to show which standard we are using for our POS. UD
seems to become a more or less well known abbreviation - which should make it clear that we use the universal dependencies
Maybe a suffix
instead of a prefix
?
I would prefer a prefix over a suffix (but my opinion should not be weighted too high as I am not an active DKPro Core user). We use the type of annotation as prefix, not the standard, e.g., the type would be something like POSTagAux
@pkluegl aren't such verbose names inconvenient when writing Ruta rules?
Yes, they are :-) This was not a proposal, just sharing alternatives. On the other hand, same short names for different packages (as in DKPro) is also not optimal for Ruta.
Well, the constituents tags need to be cleaned up anyway. I don't know if there is an overlap between UD dependencies and UD pos tags.
So just to throw a few additional naming schemes here:
AUX
(this is the one we want to get rid of because of conflicts)UD_AUX
AUX_UD
POS_AUX
P_AUX
pAUX
$AUX
( I wouldn't be surprised if that would cause problems ;) )$AUX
: ahhh... not good. After all problems there might be, at least Ruta won't work with this.
pAUX
: I'd rather stay with the convention that types start with a upper-case letter
P_AUX
: P is not a good abbreviation for pos in my opinion, not really a big advantage over POS_AUX
POS_AUX
, UD_AUX
, AUX_UD
would be ok I think, preference in this order (descreasing).
Is the capitalization important for you? Camel-case is something better for auto-completion.
(Just wanted to mention that Ruta can handle the same short names, but you need aliasing or something when importing the typesystem packages.)
Regarding capitalization: as mentioned before, we wanted to be in sync with the UD POS tags, and they are all capitalized: http://universaldependencies.org/u/pos/index.html
Ah ok, I thought they would also be identifiable if the case was changed for some chars :-D nevermind
Well, in principle yes - I would like to keep the extract logic going from Windows-safe-names to UD names as minimal as possible. Removing/adding an affix is an unambiguous substring operation. Changing the case is not necessarily unambiguous and is simple yet another operation that I'd prefer to avoid.
@reckart there are overlaps between the universal POS tags and the universal dependencies, e.g. POS: AUX, see http://universaldependencies.org/u/pos/AUX_.html (here the mouseover text says: u-pos AUX) dependency: aux see http://universaldependencies.org/u/dep/aux_.html (the mouseover text says: u-dep aux)
another example: conj
I think there is no all-around best solution, though if an affix is to be introduced, it's probably better to be a prefix.
Moin moin, are there plans to address this issue before 1.9.0 is released? Just curious, because this was mentioned to me by a co-worker and I told him I'd ask. For some reason, our dev machines are all Windows (but usually have a Linux VM on them, so we could somehow make it work).
@ferschke Care to contribute?
I could look into this, but from the discussion it seems there are still quite a couple of decisions to be made how to do this.
So I tend towards using POS_AUX as a schema for the type names then.
We have a Jenkins job now that tries compiling DKPro Core on Windows. Of course it fails, but with the new job, we have a better way to noticing these failures and fixing them.
I'll take a look at the types.
Ok, I'll have several questions.
Here's the first one: Should the tweet POS tags also get a POS prefix?
I guess so.
I suppose we'd end with prefixes for any type of POS, Dependency and Constituent subclasses like:
Dear Richard, Is this issue regarding build-failure on windows (10)? I just tried to build 1.9 but I get the following error:
[ERROR] Failed to execute goal org.apache.uima:jcasgen-maven-plugin:2.10.0:generate (default) on project de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl: JCasGen: Input/Output exception occurred. [ERROR] Exception: D:\Workspaces\GitRepository\dkpro-core\dkpro-core-api-lexmorph-asl\target\generated-sources\jcasgen\de\tudarmstadt\ukp\dkpro\core\api\lexmorph\type\pos\AUX.java (The system cannot find the file specified)
Regards, Reza
@rezahay , yes this problem is covered by this issue.
I renamed the POS types now and fixed most of the tests. There are still many other test failures on my machine which could cover some missing renamings. My pull request waits for the ICLA answer.
What about the types for the syntax annotations? The correspnding artifact is building without problems since the customized (with PRN) is excluded.
I mean is it worth renaming AUX0 to DEP_AUX when it means we are renaming about 58-87 types, and there is no urgent need (as of we can build with windows)
Well... do you want to be able to do a full build on windows or not? Do you think it would be sufficient to build only the non-type modules?
... of course a full build. The thing is: I can build api.syntax-asl
There are some other modules like fs.hdfs-asl which could cause some troubles for windows but I would not cover them here.
You can build api.syntax-asl because somebody back in the day already hit the problem that AUX was a reserved name on Windows and changed it to AUX0. This was a long time ago.
Then the Universal POS tags and Universal Dependencies came into existence and we had a few discussions around switching from "DKPro Core specific" coarse-grained tags to commonly accepted coarse grained tags:
This brought back the problem that tag names clashed with Windows device names.
Then we had some discussion around how to resolve such clashes. I was in favor of doing it in a way that affects all tags equally, e.g. adding a prefix or suffix. The benefit would be that we could arrive at the proper coarse-grained tag simply by stripping the affix - and it would work in the same for all tags - no special treatment for individual tags.
Am I understanding correctly that you would be in favor of an approach that would introduce exceptions only for specific tags?
Nope, it depends on the perspective:
Afterall, its about when 1.9.0 should be released and how much time is available to do the changes.
What is the status of #1088, #516 and #517, and what is there impact on this issue?
api.sytnax
module appears to be building on Windows. Their relation to the present issue is that addressing #517 and #516 are also kind of introducing a "breaking" type system change.Btw. the present issue is introducing a "more breaking" change because we actually have to delete the POS types with the names that conflict with Windows device names. Thus, by default DKPro Core 1.9.0 would no longer be able to load XMIs annotated with 1.8.0 POS type names.
While looking at #1090, I noticed that the coarseValue
of the POS
annotation is now often (if not always) set to POS_something
. However, the coarseValue
should not include the POS_
prefix. That's something that needs to be changed everywhere where the coarseValue
is set.
Usually we currently use this code:
posAnno.setCoarseValue(posAnno.getClass().equals(POS.class) ? null
: posAnno.getType().getShortName().intern());
That needs to be changed to strip the POS_
prefix.
Original issue reported on code.google.com by
pedrobssantos
on 2014-07-01 21:56:01