bitnine-oss / agensgraph

AgensGraph, a transactional graph database based on PostgreSQL
http://www.agensgraph.org
Other
1.34k stars 148 forks source link

questions about agensgraph #256

Closed candysmurf closed 6 years ago

candysmurf commented 7 years ago

Only a couple of questions now and I'll have more questions to follow. thanks.

gtyun commented 7 years ago
candysmurf commented 7 years ago

May I know:

gtyun commented 7 years ago

Thank you for your interest.

candysmurf commented 7 years ago

May I know how I can use your agens browser? Any purchase plan available? What's your roadmap?

joshuayb commented 7 years ago

Hi Candy,

Our R&D Center is in Santa Clara. I think we are 15 min driving distance at most. Please, give me an e-mail and then we can schedule a meeting. (or a video conference)

Thanks.

candysmurf commented 7 years ago

@joshuayb, I saw that you guys will demo this to us today. thanks for your reply.

joshuayb commented 7 years ago

Hi Emily,

Yes, I assumed you might be in the meeting today. It was nice to talk with you all. Feel free to contact me anytime if you have any questions. joshua@bitnine.net

candysmurf commented 7 years ago

May I know what client do you support? Java for sure. Any others?

gtyun commented 7 years ago

Only java is supported for now. Python and Node.js are planned.

candysmurf commented 7 years ago

@gtyun,thanks for your reply. I'll give Java a try.

candysmurf commented 7 years ago

May I install 3 instances on the same machine to form a cluster? Any guideline/example?

gtyun commented 7 years ago

If you want to make a replication cluster, you can refer to https://www.postgresql.org/docs/9.6/static/warm-standby.html https://wiki.postgresql.org/wiki/Streaming_Replication

If each instance has its own PGPORT, it's possible that 3 instance on the same machine. Actually I tested with virtual box. The Following is brief description for hot standby server configuration.

Add to pg_hba.conf host replication ‘username_primary’ ‘ip address_standby’ trust host replication ‘username_standby’ ‘ip address_primary’ trust

Fix postgresql.conf listen_address = '*' wal_level = 'hot_standby' max_wal_senders = 4 wal_keep_segments = 64 hot_standby = on max_replication_slots = 1

Copy master's database cd $AGENS_HOME mkdir -m 700 /data pg_basebackup -h hostip -U username -D data -R

Fix ./data/recovery.conf recovery_target_timeline = 'latest' primary_slot_name = 'rp_slot_name' Add "application_name=slave1" to primary_conninfo

candysmurf commented 6 years ago

@gtyun, thanks for your reply. I was busy doing other stuff. Now I really need to get this going. Where are pg_hba.conf and postgresql.conf files? I don't see them.

Any docker cluster setup do you have? If you have time, I would like to have a work session with you. thanks.

gtyun commented 6 years ago

That conf files are in the directory where you have initialized database byfollowing command. 'initdb -D /path/to/datadir' Or if you set $AGDATA, than conf files will be there after initdb. Unfortunately, docker image is not ready yet.

gtyun commented 6 years ago

@candysmurf , I'm going to build docker image for AgenGraph and @joshuayb will contact to you about the work session

johnberzy-bazinga commented 6 years ago

@gtyun @candysmurf If it's any help I've modified the official postgres docker image a while back to work with agens. I've only been using it locally, so have not vetted it for production use. Also, I've removed the hadoop from the list of extensions as I could not get it to work.

https://gist.github.com/johnberzy-bazinga/b89e39f105b0ac3a962b97ccac200cfd

usage (with no persistent volume) :

docker build -t agensgraph .

docker run --rm -p 5432:5432 -e POSTGRES_DB='testdb' agensgraph:latest

johnberzy-bazinga commented 6 years ago

Also, I've added experimental client support for .NET on top of Npgsql if interested.

https://github.com/johnberzy-bazinga/Npgsql.AgensGraph

candysmurf commented 6 years ago

@johnberzy-bazinga, thanks for your docker image. It's one instance though. I need to build the stream replication agensgraph 3 nodes cluster. Can we turn this to use docker-compose on different ports on the same machine?

Update I tried this image: https://github.com/kimkw0227/agensgraph-docker/tree/master/1.2.1. It worked for me. it's one instance.

My problem currently is that I don't see the manual master-slave replication work. If it works, I would love to have such an image for 3 nodes cluster. thanks!

candysmurf commented 6 years ago

@gtyun, I'm trying to setup two nodes first on the same machine using different ports. As they're on the same machine with different ports. Do I stil need to add the following:

Add to pg_hba.conf host replication ‘username_primary’ ‘ip address_standby’ trust host replication ‘username_standby’ ‘ip address_primary’ trust

Leaving ip address empty? As I didn't setup a username yet, leaving them empty as well?

Update

  1. I added the following in the master pg_hba.conf file. The user replica was created.

    host    replication     replica      10.211.55.15/24            md5
    host    replication     replica      10.211.55.15/24            md5
  2. Modified the master postgresql.conf same as the what @gtyun has above

  3. Modified the slave postgresql.conf same as #2.

  4. Modified data/recovery.conf as following:

standby_mode = 'on'
recovery_target_timeline = 'latest'
primary_slot_name = 'rp_slot_name'
primary_conninfo = 'user=replica password=replicauser@ host=10.211.55.15 port=5433 application_name=slave1 sslmode=disable sslcompression=1'

restarted both master and slave. I didn't see any data got replicated. Anything am I missing?

candysmurf commented 6 years ago

if I execute the following statement, it returned 0 rows.

vagrant@ubuntu1604:~$ agens -p 5433 test
agens (AgensGraph 1.2.0, based on PostgreSQL 9.6.2)
Type "help" for help.

test=# select * from pg_stat_replication;
 pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | backend_xmin | state | sent_location | write_location | flush_loc
ation | replay_location | sync_priority | sync_state
-----+----------+---------+------------------+-------------+-----------------+-------------+---------------+--------------+-------+---------------+----------------+----------
------+-----------------+---------------+------------
(0 rows)

Is there anything missing?

gtyun commented 6 years ago

my pg_hba.conf is host replication ygt 127.0.0.1/32 trust

'replica' is your agensgraph user name? And you don't need to write twice. But it does not matter anyway.

And in master session test=# SELECT * FROM pg_create_physical_replication_slot('rp_slot_name');

Everything else looks fine.

My result is ... [ygt@ygtpc agens-graph]$ ps x | grep postg 3603 pts/2 S 0:00 /home/ygt/work/git/agens-ygt/AgensGraph/bin/postgres 3611 ? Ss 0:00 postgres: checkpointer process
3612 ? Ss 0:00 postgres: writer process
3613 ? Ss 0:00 postgres: wal writer process
3614 ? Ss 0:00 postgres: autovacuum launcher process
3615 ? Ss 0:00 postgres: stats collector process
3993 pts/2 S 0:00 /home/ygt/work/git/agens-ygt/AgensGraph/bin/postgres -D old -p 5433 3994 ? Ss 0:00 postgres: startup process recovering 000000010000000000000012 3995 ? Ss 0:00 postgres: wal receiver process streaming 0/12000060 3996 ? Ss 0:00 postgres: wal sender process ygt 127.0.0.1(40360) streaming 0/12000060 3998 ? Ss 0:00 postgres: checkpointer process
3999 ? Ss 0:00 postgres: writer process
4000 ? Ss 0:00 postgres: stats collector process
4009 pts/2 S+ 0:00 grep --color=auto postg [ygt@ygtpc agens-graph]$ agens agens (AgensGraph 1.2.0, based on PostgreSQL 9.6.2) Type "help" for help.

ygt=# select * from pg_stat_replication ; pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | backend_xmin | state | sent_location | write_location | flus h_location | replay_location | sync_priority | sync_state ------+----------+---------+------------------+-------------+-----------------+-------------+-------------------------------+--------------+-----------+---------------+----------------+----- -----------+-----------------+---------------+------------ 3996 | 10 | ygt | slave1 | 127.0.0.1 | | 40360 | 2017-11-02 18:28:34.072319+09 | | streaming | 0/12000060 | 0/12000060 | 0/12 000060 | 0/12000060 | 0 | async (1 row)

candysmurf commented 6 years ago

@gtyun, yes, replica is the replication user created. after I change md5 to trust, when doing select, I got

test=# SELECT * FROM pg_create_physical_replication_slot('rp_slot_name');
ERROR:  replication slot "rp_slot_name" already exists

Also my output is different than yours

vagrant@ubuntu1604:~$ ps x | grep postg
13148 pts/0    S      0:00 /share/agensgraph/bin/postgres -D db_cluster
13150 ?        Ss     0:00 postgres: checkpointer process
13151 ?        Ss     0:00 postgres: writer process
13152 ?        Ss     0:00 postgres: wal writer process
13153 ?        Ss     0:00 postgres: autovacuum launcher process
13154 ?        Ss     0:00 postgres: stats collector process
13158 pts/0    S      0:00 /share/agensgraph/bin/postgres -D db_cluster_1
13160 ?        Ss     0:00 postgres: checkpointer process
13161 ?        Ss     0:00 postgres: writer process
13162 ?        Ss     0:00 postgres: wal writer process
13163 ?        Ss     0:00 postgres: autovacuum launcher process
13164 ?        Ss     0:00 postgres: stats collector process
13173 pts/0    S+     0:00 grep --color=auto postg

Maybe something is still missing?

candysmurf commented 6 years ago

Note that as I have two instances on the same server, i have to login the master using this command:

agnes -p 5433 test

test is one of DB in the master instance.

candysmurf commented 6 years ago

@gtyun, what command did you use to init two instances on the same machine? I'm wondering if there are differences between yours and mine. I had to manually change the port number in both postgreps.conf.

gtyun commented 6 years ago

That ps x output shows that your both sessions are master. The recovery.conf must be in slave's db_cluster to start up as slave.

command for my master session ag_ctl start -w -D ./data -o "-p 5432"

command for my slave session ag_ctl start -w -D ./old -o "-p 5433"

both username is ygt and dbname is ygt

candysmurf commented 6 years ago

@gtyun, I created data directory inside db_cluster_1 which is slave. The recovery.conf generated after running pg_backup.

Does it require the dbname and username to be same?

Maybe I should try your -o command option to see if it makes any difference.

Also did you modify slave postgres.conf with the same setting as the master?

gtyun commented 6 years ago

dbname and username do not have to be the same. And postgres.conf has been replicated and not modified.

candysmurf commented 6 years ago

@gtyun, mine two instances are like master/master. I don't know what I'm missing. May I have a meeting with you tonight?

candysmurf commented 6 years ago

@gtyun, where is agensgraph data directory? isn't it the one that was in the path when I initiated a DB? I initiated two dbs, one for master and another for slave. the data directory should created inside the slave db, e.g. db_cluster_1 in my case. Isn't it?

gtyun commented 6 years ago

@candysmurf The master's data directory is initiated by 'initdb' command. It's the path '$AgensHome/data' in my case. And slave's data directory is created by 'mkdir' and filled by 'pg_basebackup'. It's '$AgensHome/old' in my case.

candysmurf commented 6 years ago

@gtyun, nevermind. I made it work now. The part I did differently is that I didn't mkdir data directory and also I had to use replica as the wal_level, such as wal_level=replica in the master.

I bet that I can use the same way to create as many slaves as I like.

It's confusing to me as I didn't have postgres background. Anyway, thank you.

gtyun commented 6 years ago

@candysmurf Sounds great. Good work.

candysmurf commented 6 years ago

@gtyun, next question is that if the master is down, I would like to promote one of the slaves to be the master then restart master automatically. Anything do you recommend? thanks

gtyun commented 6 years ago

@candysmurf You can promote manually like following command ygt=# ag_ctl promote -D $slave_data ygt=# ag_ctl restart -D $slave_data

If you want to build a high availability cluster with auto failover, yon can reference https://clusterlabs.github.io/PAF/ .

Building HA cluster is much more complicated. It was painful for me to do that...

candysmurf commented 6 years ago

@gtyun, ah. Thanks.

If we don't build that, how do you know a master is down and need to promote a slave? When restarting the master, any agnes's command or I need to alter the parameters?

gtyun commented 6 years ago

@candysmurf If master is down, than all kind of connection must be failed. You need to check status of master periodically. PAF used pacemaker for checking. When a slave is promoted to a master, parameter modification is not required. Just the recovery.conf must be removed. If promotion was successful, it will removed automatically. But you must check connection information. Because slave has different ip address or port.

candysmurf commented 6 years ago

@gtyun, any feature is out of the box of AG to check the master? I just don't want to miss anything which is available. Or we'll have to use PAF or...

gtyun commented 6 years ago

@candysmurf If you must build HA cluster, you should just use PAF. Postgresql does not support auto failover by itself.

candysmurf commented 6 years ago

@gtyun, for modeling nested JSON objects, eg.

Object { "name": "a", "stauts": { "state": "on" .... } }

Agensgraph seems only support primitive types plus array. status will have to model as a vertex. Is it right?

gtyun commented 6 years ago

Yes, it is JSONB. Sure JSON and JSONB is almost identical. JSONB supports primitive types and JSONB object and array.

candysmurf commented 6 years ago

I tried to save it as JSONB. I always got an error:

First:

JSONObject jstatus = new JSONObject();
jstatus.put("status", jstatus);

Second

Jsonb statusJson = new JsonbObjectBuilder().add("status", status).build();

Error:

java.lang.IllegalArgumentException: Invalid json value type
    at net.bitnine.agensgraph.util.JsonbUtil.filterValueType(JsonbUtil.java:148)
    at net.bitnine.agensgraph.util.JsonbObjectBuilder.add(JsonbObjectBuilder.java:63)
    at com.javasampleapproach.jdbcpostgresql.dao.impl.FMSVertexDaoImpl.insert(FMSVertexDaoImpl.java:79)
    at com.javasampleapproach.jdbcpostgresql.dao.impl.FMSVertexDaoImpl$$FastClassBySpringCGLIB$$3c333562.invoke(<generated>)
    at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
    at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:720)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
    at org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.invoke(PersistenceExceptionTranslationInterceptor.java:136)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
    at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:655)
    at com.javasampleapproach.jdbcpostgresql.dao.impl.FMSVertexDaoImpl$$EnhancerBySpringCGLIB$$1b6fdd8.insert(<generated>)
    at com.javasampleapproach.jdbcpostgresql.service.impl.FMSVertexServiceImpl.insert(FMSVertexServiceImpl.java:23)
    at com.javasampleapproach.jdbcpostgresql.SpringJdbcTemplatePostgreSqlApplication.main(SpringJdbcTemplatePostgreSqlApplication.java:47)
candysmurf commented 6 years ago

JSONB seems not searchable. e.g. if I have data like

system[3.4]{"id": "8888", "name": "we", "ctype": "type B", "status": {"health": "good"}}

Cypher query got an error.

match(s:system{status.health: 'good'}) return s;

Please let me know if I miss anything.

gtyun commented 6 years ago

Then this will work well. `match(s:system { status : {health: 'good' }}) return s;

candysmurf commented 6 years ago

@gtyun, thanks for your help, it worked! How to find all nodes that have no relationship?

graph=# start n=node(*) match(n)-[r?]-() where r is null return n;
ERROR:  syntax error at or near "n"
LINE 1: start n=node(*) match(n)-[r?]-() where r is null return n;
gtyun commented 6 years ago

That question is interesting. I didn't think about the vertex without edges.. You can get that vertex like following.

MATCH (a), (b)-[r]-(c) WHERE a != b and a != c RETURN a

yinmingjun commented 6 years ago

This is new issue here: https://github.com/bitnine-oss/agensgraph/issues/288


I try to use like in match-where clause, and I got an error of "Error : ERROR: syntax error at or near "like"".

The cypher I had tested like this: match (a:label_a) where a.name like '%mk%' return a;

How should I do like this? Or another similiar way is also welcomed! @candysmurf

joshuayb commented 6 years ago

Your query is supposed to work. I don't see any syntax error there. I think we need to see the actual screenshot to help you. We may do a quick shared screen call if that helps. Let me send you an e-mail shortly.

yinmingjun commented 6 years ago

@joshuayb You can see the issue here: #288 Thanks.

joshuayb commented 6 years ago

Thank you for sharing the issue.

candysmurf commented 6 years ago

@gtyun, one last question before we close this issue. Everything works well except that merge operation does not detect the duplicates.

For example, I want to save a vertex(node), it's an object which has a property id=10. By using merge, if I save the object with the same id, should it detect it and just update the object instead of creating a duplicate? Please advise if I miss anything. Your timely response is greatly appreciated. We'll have a demo this Friday.

To be more specific, I used this:

MERGE (:person ?), shouldn't it update the person when the person's ID is the same instead of creating a duplicated one? If I did it wrong, please let me know if I need to find first then create?