Closed: cfgamboa closed this issue 1 year ago
Hi Carlos,
I am currently deep into solving a set of issues which are rather urgent in an area completely unrelated to xrootd, so I cannot address this quickly. I will try to get to it as soon as I can.
Al
Thank you Al.
If you could, in the meantime, actually open an RT ticket for this with the full client -d3 output, that would be helpful.
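For example, something like the following (the door host and file path here are placeholders, the port is the one from your door configuration):
xrdcp -d 3 root://<door-host>:1096//<pnfs-path> /tmp/testfile 2> xrdcp-d3.log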
Thanks.
Al
Hi Carlos,
real quick question: do webdav and GFTP work in the same situation?
Dmitry
No, webdav works fine.
Carlos, still working on these other unrelated issues, so I wanted to tell you I will probably not have time to look at this until next week.
In the meantime: if you could collect that client debug log I asked for, that would be helpful.
Also, when you say you are using 8.2.4, did you ever use a prior version of 8.2 (like 8.2.1?) which worked for this? I'm surprised you are discovering this now when I thought the initial testing was positive. Or did you switch network configurations? Just want to know if this worked before, or whether it has never worked.
Hello, the use case here is for a pool that does not support IPv6. Is there any change from 8.2.2 to 8.2.4 in dCache that could contribute to this?
All the best, Carlos
Certainly no change to xroot. Whether some other change remains to be seen. Did this work for 8.2.2?
Al
The test and deployment of 8.2.2 was for a dCache instance with a different use case; there the pools have a dual IPv4/IPv6 stack.
What did you set xrootd.net.internal= to?
21 Nov 2022 14:32:53 (Xrootd-dcqos002-proxy) [door:Xrootd-dcqos002-proxy@xrootd-dcqos002Domain:AAXuAB76Cfg dcqos005_4 DoorTransferFinished 000055A20830F08D410CABBBA29EC974CC27] Transfer 000055A20830F08D410CABBBA29EC974CC27@PoolName=dcqos005_4 PoolAddress=dcqos005_4@dcqos005fourDomain failed: General problem: Unable to find address that faces lxplus732.cern.ch/2001:1458:d00:1:0:0:100:42f (error code=666)
I'm confused. This looks like the poolManager is trying to match the pool to the client. The client should not be connecting to the pool. The choice of the pool should be on the basis of the door, not the client. I am rather surprised this is happening.
It really would be helpful if I could see the client logs and also your door configuration.
Thanks.
Al
Please take a look at the configuration posted before:
[xrootd-${host.name}Domain]
[xrootd-${host.name}Domain/xrootd]
xrootd.cell.name=Xrootd-${host.name}-proxy
xrootd.net.port=1096
xrootd.net.proxy-transfers=true
xrootd.net.internal=10.42.38.49
Ah yes, sorry, lost in the delay.
The pool is running on IPv4:
bond0: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST> mtu 1500
inet 10.42.64.26 netmask 255.255.254.0 broadcast 10.42.65.255
ether b8:59:9f:3a:38:34 txqueuelen 1000 (Ethernet)
RX packets 4177676 bytes 1167537379 (1.0 GiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 8953448 bytes 8660230360 (8.0 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
The door interfaces are below; it appears that the door is using the IPv6 component to interact with the internal pool, which is on IPv4:
bond0: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST> mtu 1500
inet 192.12.15.15 netmask 255.255.255.0 broadcast 192.12.15.255
inet6 fe80::9a03:9bff:fe89:e1fe prefixlen 64 scopeid 0x20<link>
inet6 2620:0:210:1::f prefixlen 64 scopeid 0x0<global>
ether 98:03:9b:89:e1:fe txqueuelen 1000 (Ethernet)
RX packets 47225761 bytes 4285494676 (3.9 GiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 1326621 bytes 712014232 (679.0 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
bond1: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST> mtu 1500
inet 10.42.38.49 netmask 255.255.255.0 broadcast 10.42.38.255
inet6 2620:0:210:8803::49 prefixlen 64 scopeid 0x0<global>
inet6 fe80::9a03:9bff:fe04:736 prefixlen 64 scopeid 0x20<link>
ether 98:03:9b:04:07:36 txqueuelen 1000 (Ethernet)
RX packets 161305582 bytes 212950321177 (198.3 GiB)
RX errors 0 dropped 2 overruns 0 frame 0
TX packets 30543840 bytes 20273395118 (18.8 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
Yes, I understand that. What I don't understand is why the PoolManager is using the client's address to match the pool.
Maybe there is an edge case here which was neglected and masked in the dual-stack case.
lxplus732.cern.ch/2001:1458:d00:1:0:0:100:42f
is the client, right?
I think I see what the problem is here. I'll get back to you.
Yes, that is the client.
Thank you.
Carlos,
I have figured out what the issue is. Let me explain it to you so you can follow why I will need to discuss this with the group tomorrow before taking action.
When the transfer comes into the door, the door has to do, among other things, the following: (1) ask the PoolManager to select a pool for the transfer, and (2) pass a client address to the pool when the mover is started.
Now, when we added the internal address, it is that address we use to select the pool (1) for the proxy. However, given the current state of the code, which is shared across various doors (not just xroot), we decided we should continue passing the original client address for (2), because when the mover is started billing is updated, and we wanted the billing entry to reflect the actual user/external client, not the proxy/door acting as the client.
When the pools are dual stack, this is not a problem. But if the pool does not support IPv6, the mover start fails because it thinks it needs to connect to the client.
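To make that concrete with the addresses from this thread: the pool is selected against the door's internal interface 10.42.38.49, but the mover start still carries the external client address lxplus732.cern.ch/2001:1458:d00:1:0:0:100:42f; the IPv4-only pool (10.42.64.26) then cannot find an interface facing that IPv6 address, which is exactly the "Unable to find address that faces ..." failure in the log above.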
There are two possible solutions here: (1) pass the door's internal address to the pool at mover start as well, which is a quick fix but means the billing record would then show the door rather than the original client; or (2) change the shared code so that the pool connects back to the door's internal address while billing still records the external client.
The second solution is better, of course, but it will entail more changes which are potentially disruptive to other protocols.
Before taking the second solution, however, I would like to get the opinion of the rest of the team.
Can you live with the delay, or do you prefer a quick fix (which will scramble your billing records) and then just update to the better fix when it is provided (which would undoubtedly mean moving to the 9.0+ version)?
Cheers, Al
Carlos,
The team consensus is that we go with the first solution; it is not crucial that the billing record reflect the original client IP; that data can be obtained from the door record, which can be associated with the billing record through door.transaction=billing.initiator.
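For example, the billing entry's initiator would carry the same value as the door record's transaction, presumably something like the door session visible in the log above (door:Xrootd-dcqos002-proxy@xrootd-dcqos002Domain:AAXuAB76Cfg), so the original client IP can still be recovered by that association.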
I will be posting a patch soon.
Al
Dear all,
It seems that xroot proxy mode is not redirecting IPv6 client requests via the xrootd.net.internal interface.
I am using dCache 8.2.4.
If IPv6 is used to issue the request, the transfer fails.
The xroot log extract:
The pool sees the external client IP, not the proxy door IP; see below:
The transfer should use the internal interface, configured as:
[xrootd-${host.name}Domain]
[xrootd-${host.name}Domain/xrootd]
xrootd.cell.name=Xrootd-${host.name}-proxy
xrootd.net.port=1096
xrootd.net.proxy-transfers=true
xrootd.net.internal=10.42.38.49
Could you please advise?
All the best, Carlos