Closed jmr closed 4 years ago
I agree with that std::string_view
version getters and setters are convenient.
However, const char *
version setters should not be replaced.
If the argument type changes depending on __cplusplus
, users will be confused.
For example, if libmarisa is built on C++17 or later, callers must be built on C++17 or later. If there is a conflict, the behavior is undefined.
I agree with that
std::string_view
version getters and setters are convenient. However,const char *
version setters should not be replaced. If the argument type changes depending on__cplusplus
, users will be confused.
There shouldn't be any confusion. You can still pass a const char *
in C++17 mode, since string_view
can be implicitly constructed from const char *
. I didn't remove any test cases, so you can verify that this still works.
Maybe I should add a comment like, "Don't worry, you can still pass const char *
"?
For example, if libmarisa is built on C++17 or later, callers must be built on C++17 or later. If there is a conflict, the behavior is undefined.
This is a good point. Shall I make it a configure option?
How about adding accessors with new names, such as set_str_view()
?
We can easily avoid conflict problems.
How about adding accessors with new names, such as
set_str_view()
?
To me, it seems nicer to have the set_str
overloads. What's the advantage of having both set_str(const char *)
and set_str_view(string_view)
? Just set_str(string_view)
will do everything and have slightly smaller binary size.
If I understand correctly, C++ doesn't have that kind of stdlib ABI compatibility. If Marisa was linked against C++17 stdlib, it is not guaranteed to work with C++11 stdlib even if new C++17 stuff isn't used. Am I wrong?
If we make MARISA_USE_STRING_VIEW
a config option it will work and save a lot of thinking.
The same version of gcc with different -std=
options are compatible: https://stackoverflow.com/a/49119902
I think if we did something like
#if __cplusplus >= 201703L
void set_str(string_view);
#endif
void set_str(const char *);
void set_str(const char *, size_t);
it will even work with both versions, since we're just adding a non-virtual function and not changing the representation. I'm not really sure, since I haven't thought about it enough, though.
If Agent::set_query(const char *)
is inline, I could accept this pull request without worry.
If libmarisa is built on C++17 or later, Agent::set_query(const char *)
does not exist.
In this case, users must use C++17 or later, because Agent::set_query(const char *)
is only declared.
I believe it will confuse users.
If
Agent::set_query(const char *)
is inline, I could accept this pull request without worry.
Why is it better if it's inline?
If libmarisa is built on C++17 or later,
Agent::set_query(const char *)
does not exist. In this case, users must use C++17 or later, becauseAgent::set_query(const char *)
is only declared. I believe it will confuse users.
Hmm. I fixed this, but now if the user build libmarisa
with C++14 or earlier, trying to use set_str(string_view)
(with -std=c++17
or equivalent, of course) will just result in a confusing link error (I think).
Maybe MARISA_USE_STRING_VIEW
is really best?
Why is it better if it's inline?
I'm sorry that I could not explain it well.
I would like to show you an example.
#ifndef ECHO_H
#define ECHO_H
#include <string_view>
class Echo {
public:
#if __cplusplus >= 201703L
std::string_view echo(std::string_view s) {
return s;
}
#else // __cplusplus >= 201703L
const char *echo(const char *s);
#endif
};
#endif // ECHO_H
#include "echo.h"
#if __cplusplus < 201703L
const char *Echo::echo(const char *s) {
return s;
}
#endif // __cplusplus < 201703L
// main.cc
#include <iostream>
#include "echo.h"
int main() {
Echo echo;
std::cout << echo.echo("Mike") << std::endl;
return 0;
}
If echo.cc is compiled with C++17 and main.cc is compiled with C++14, the linker fails due to undefined reference as follows:
$ g++ -c -std=c++17 echo.cc
$ g++ -c -std=c++14 main.cc
$ g++ echo.o main.o
main.o: In function `main':
main.cc:(.text+0x26): undefined reference to `Echo::echo(char const*)'
collect2: error: ld returned 1 exit status
This is because Echo::echo(const char *)
does not exist in echo.o.
If the definition of Echo::echo(const char *)
is provided in echo.h (inline), this problem does not happen.
Maybe MARISA_USE_STRING_VIEW is really best?
I think it's not a good way.
That makes sense, thanks.
2139f49 no longer has that problem, since I only add functions, not delete them, but I'll do something like your echo but with inline.
See if #31 is ok, then I'll do the inline version as you suggest.
I think the current version of this PR (without #31 ) works, with a tiny disadvantage of larger binary size. It's equivalent to this version of echo:
echo.h
#include <string_view>
class Echo {
public:
#if __cplusplus >= 201703L
std::string_view echo(std::string_view s) {
return s;
}
#endif
const char *echo(const char *s);
};
echo.cc
#include "echo.h"
const char *Echo::echo(const char *s) {
return s;
}
$ g++ -c -std=c++17 echo.cc && g++ -c -std=c++14 main.cc && g++ echo.o main.o && ./a.out
Mike
Can we proceed with this and decide whether/how to make Agent::set_query
inline later?
Sorry for my late response.
It looks good to me. Thank you!
In
Agent
,Key
, andQuery
; add functions to set the string using astd::string_view
. Remove theconst char *
version.Also add getters:
These are convenient because
becomes
Setting with a
const char *
still works, sincestring_view
has an implicitconst char *
constructor.Similarly,
becomes
std::string_view
is a C++17 feature, so all changes are guarded with#if __cplusplus >= 201703L
.