scieloorg / document-store-migracao

Document Store (Kernel) - Migration
BSD 2-Clause "Simplified" License

Articles without ISSN in the Kernel #411

Open jamilatta opened 3 years ago

jamilatta commented 3 years ago

Problem description

While investigating why the following article is missing from the site after the migration, I found that there are articles in the Kernel whose XML has no ISSN, so the ISSN is also missing from the paths built for their assets.

Article: https://new.scielo.br/article/S1517-31512013000300006

Steps to reproduce the problem

Example of the error: https://kernel.scielo.br/documents/C6mQBy6rj57xFSYWLTmntxt

Looking at MinIO, you can see several more articles in the same situation: https://minio.scielo.br/minio/documentstore/None/
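The `None` segment in the MinIO path above is consistent with a missing ISSN being interpolated into a storage key as the literal text "None". A minimal sketch of that failure mode follows. The helper names and the `issn/pid` key layout are hypothetical, chosen only for illustration, and are not taken from the migration tool's code.

```python
from typing import Optional


def build_asset_prefix(issn: Optional[str], pid_v3: str) -> str:
    """Hypothetical key builder: joins the ISSN and the PID v3 into a prefix."""
    # An f-string renders None as the text "None", so a missing ISSN
    # silently produces a "None/..." prefix instead of failing.
    return f"{issn}/{pid_v3}"


def build_asset_prefix_strict(issn: Optional[str], pid_v3: str) -> str:
    """Same hypothetical builder, but it refuses to build a key without an ISSN."""
    if not issn:
        raise ValueError(f"document {pid_v3} has no ISSN")
    return f"{issn}/{pid_v3}"


if __name__ == "__main__":
    print(build_asset_prefix(None, "C6mQBy6rj57xFSYWLTmntxt"))
```

Failing early, as in the strict variant, would surface such documents during migration rather than after publication, assuming the key is built in a comparable way.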

Expected behavior

The document must contain the ISSN.

jamilatta commented 3 years ago

List of PIDs v3 in this situation:

3BTLDWzmHmHfqGwtZjRFhNS
3Rq5B7HzHhcV9XPgHJhRsZF
4T4Fty7pkvxz4GDcgrc3FxQ
4TpKw6gSsVnLjn4j3qCVcmh
4ggsxNmqdGWMM7bqyLfhN4J
4q9wmwtZfMr73fcPRmCXQDk
54gKs9dnzcZkTnZhGdGthpm
55RhWmGGSzvQ6s4mVgwGnpp
5mGdpfjnYY9xrntWm3G5TDj
5xmCMy6Nbrp6wVX4SGj5tCS
6PSX4Hwq7TZNxt5YdRh6Yqm
6zHHWQ93dCzMRM3Rfvxz3Yc
7bQ3GbNY5LYvKnhqsyTpqVs
7hWN5fb98CxSqMH7vjVXBSJ
7rYcwtB6nqnGcrXZFRDZvFH
8GmrcPpVvV9Pzqr7jxsXxsy
8Jrzwwk7MybzwGjQSVGDZRH
8VHV6qGd8YbcX7MFkGJfQKb
8tc8hMCgfp3wbFSXZgDjdWM
97Dtm9Y8CgQH3jVygWRqm6h
9DcCXGGDR4HVCcxkcz3rpKt
9NcsMTBD4dSdgD48krqSS3S
9PJV4wMGsdVPZF9z8S9WxDn
9czPVtXvyVDTqtkJXvRtYtM
9kxF7SCvZVDsZQG4wLSrFTL
BC3Nyt6dkCmDZyRGZ6ygD4f
BDK5twJFSh9vQHHqQtw7GGz
BFRJqthXJdk6Gn8RDLYkbLd
BLhYCnrWHPZvR9gnFsJ6pwB
BpPVcJVdWJ74rqDDwXDmMzt
C6mQBy6rj57xFSYWLTmntxt
CNTd4znkW3KK5NHytZDnL3m
CwbcwB4MKCKNkfpdtb9p59F
DTqK4XxnkPzLVLD4jDNqF5J
DhtktBFJN4qQCvwtLsd6QYn
FcK5ZZZdvGtpFBbRphQrg4K
FnHVzMPxhNcJBDstXKntdmk
GCvkZLH754qBsVr5SzkYh9p
GLdBqZsjfbyyGnjkqQGX55z
GxKzFccStWdWLKBCRLrYn4p
HDK7tr4R6hLmVdKDSsbLJBx
JH7t44mzMJqP33GNZJ96TMw
JMFVjBCjcxd683PggVkjNLg
JvFxqXMWkhDC7SpHNCQHBRr
JxS6h8GDvKmxdXFNm9YnVxJ
KKF3CyxfkFGZZRF8CxK54Xh
KrhfLJrb8kkNXybjd6JjKRk
LMqpRZ8K67dYKZdcRVWZykf
MHjH4ngRFTKzHjRND3CTk3w
MZKP6XRwK48F6Z3p6HqdNxS
Mdy6MLCMRkxmcpkGtqVFqkK
NJLvq3fJPPpPfpmJdBYj5rp
NQcjZMxDh7Vxs5HD7c7FF5G
NbCJ4yKLYFXkchSQpzTWWqF
NvWYyKrqzhDDqSmxYpvGdpJ
P9b3pjdRsfm84FgFD69m6Ft
Q9R4hZJtyVyVkLFTK6kfrLr
Q9VjFhSK6P8pT7DVZZht7Nr
RMWNGWTknXvFnZtKjNXpgSH
RQFJrxG7mH3gPcwcn6k8zPd
RfBxyP7V9SCV8h97Mkw5cxJ
RvBvyRKZJDLndqQTpvBN6zp
TNN4KKMFCJcmj6zzLS3DX5K
TPdVBC49c777MX3jKgN5Rtw
Tdf73c435dtm7KnmfDXF3fv
Vkj8fG4JrbMDhhzHqLBc8Wd
Vmtt46vBQqKPRSGjWpWpSyh
XMvsvrxHcmHfDd8P34dBWkd
XXTTQGFXTVwZxWqJsywnDqv
XkFYyFvWBxkTptnz5PcwgtP
YtQZ7QwXx75PpCC6Vm6d5hk
Z9QV8vRjrDNhsLzNbsnsB9F
ZKHqMTVHQMK3vnRvpBpz7Dd
ZXnhJjWfqfTqQDKz6SfyWVw
ZfjJsBZjGn5mzSyG4XBFJmf
ZjvnLXkDMqrSwSYm4gZ9Vsf
bQb4V9kFT33fWfW8DhGD5hr
bZHp4cmfcVQQcMXQt4TZvSF
bzG8TWvqVg6pX7cybRTqtjF
cTrTGkSBzm7R5wv6J79vHPM
cmhhfBNb6tDzwFMdBxpxVKc
ctS7TY8Xpy4wNdhRBptJgkN
dNKhHJDvhRLJVgvr89w8tns
dd3wQfZGGrYfs5Gbz65XrBj
gbD3XhfwS964ympR6J3Bz8B
gsbrH8VmVhD8czsT6cYTtzF
gwCVNbK3Kq3tDLwmB9XmSfC
j4CCGgfssZqpGs9QMgTjB4C
jBHKgmrnNYsYYPxLpPKFKRL
kRYLkZfmQtkqNgL38hyNPfn
m3bGWbNRQt7fhDHCpX7gfMt
mVHYnBmjPm8RJTbKKxkgzWk
mhxky4BsSnxRV7Sv3YpRRWw
mjJDzFYsSxDm4vTdNdYgdTJ
mmP43FZ77DLpMNgqtTqsM6g
mqSWKNzwyN774m3pzMTvJpr
ns4yyBcmyLpYWV6NDhG4c4D
p3VXzgCbrknLQVs86xSTTfH
pHQt8QS5zX7CcPqD7Zmd3hc
rPBKwM4KN7vbLpr5W9Ld8QC
rfKvCbwRQgsvwfzm7MXLgVk
rhQNP9KbGz3TYkk9cLJgpjB
sLS3mC3qPCDfCWsxbs6ZWNM
sjFdfdrbkQ9ydjzxx9B63DN
syj3QgyWwGjgnzx9L7fqYSv
tBSxr3Y5ymhR6bYRmg3gzSS
tZbxRyy4m3dmgDDLXwr55TD
vcxDpfB9m8BgGbNntY9yPgp
vm4LpbTcPST9yFdr5qqjGRL
w3r3GNCDXD9DVZgLqyYJjkj
wH8tYKmW85vvZjVrvpc6cDL
wRk3fQCswFWyBV5vqxJgw4q
wWzZnVDBsPxzKt7Jq7W5QRd
xMmx8T3fLJwYHvVRgwqfDHD
xzMKHXVfRSZYt8GT4ndTQYJ
ycHQVjDBJXC3873CyCyjysK
zJ8XQMxf8bfQWFV5x6Xm5Xf
zYqXcW4dHBZ3B5j3zhRf8nN

robertatakenaka commented 3 years ago

pids = {
    "3BTLDWzmHmHfqGwtZjRFhNS": "S1517-31512012000400004",
    "3Rq5B7HzHhcV9XPgHJhRsZF": "S1517-31512013000100010",
    "4T4Fty7pkvxz4GDcgrc3FxQ": "S1517-31512014000300010",
    "4TpKw6gSsVnLjn4j3qCVcmh": "S1517-31512014000100001",
    "4ggsxNmqdGWMM7bqyLfhN4J": "S1517-31512012000400001",
    "4q9wmwtZfMr73fcPRmCXQDk": "S1517-31512014000200005",
    "54gKs9dnzcZkTnZhGdGthpm": "S1517-31512014000400001",
    "55RhWmGGSzvQ6s4mVgwGnpp": "S1517-31512013000100008",
    "5mGdpfjnYY9xrntWm3G5TDj": "S1517-31512014000400009",
    "5xmCMy6Nbrp6wVX4SGj5tCS": "S1517-31512012000300007",
    "6PSX4Hwq7TZNxt5YdRh6Yqm": "S1517-31512014000400011",
    "6zHHWQ93dCzMRM3Rfvxz3Yc": "S1517-31512014000100002",
    "7bQ3GbNY5LYvKnhqsyTpqVs": "S1517-31512013000400003",
    "7hWN5fb98CxSqMH7vjVXBSJ": "S1517-31512013000400004",
    "7rYcwtB6nqnGcrXZFRDZvFH": "S1517-31512014000100008",
    "8GmrcPpVvV9Pzqr7jxsXxsy": "S1517-31512014000100003",
    "8Jrzwwk7MybzwGjQSVGDZRH": "S1517-31512014000400006",
    "8VHV6qGd8YbcX7MFkGJfQKb": "S1517-31512014000100004",
    "8tc8hMCgfp3wbFSXZgDjdWM": "S1517-31512013000200001",
    "97Dtm9Y8CgQH3jVygWRqm6h": "S1517-31512014000100006",
    "9DcCXGGDR4HVCcxkcz3rpKt": "S1517-31512014000400003",
    "9NcsMTBD4dSdgD48krqSS3S": "S1517-31512013000400007",
    "9PJV4wMGsdVPZF9z8S9WxDn": "S1517-31512014000100007",
    "9czPVtXvyVDTqtkJXvRtYtM": "S1517-31512012000400010",
    "9kxF7SCvZVDsZQG4wLSrFTL": "S1517-31512012000400011",
    "BC3Nyt6dkCmDZyRGZ6ygD4f": "S1517-31512012000400006",
    "BDK5twJFSh9vQHHqQtw7GGz": "S1517-31512014000400002",
    "BFRJqthXJdk6Gn8RDLYkbLd": "S1517-31512013000200010",
    "BLhYCnrWHPZvR9gnFsJ6pwB": "S1517-31512014000200003",
    "BpPVcJVdWJ74rqDDwXDmMzt": "S1517-31512014000400005",
    "C6mQBy6rj57xFSYWLTmntxt": "S1517-31512013000300006",
    "CNTd4znkW3KK5NHytZDnL3m": "S1517-31512014000400008",
    "CwbcwB4MKCKNkfpdtb9p59F": "S1517-31512012000300001",
    "DTqK4XxnkPzLVLD4jDNqF5J": "S1517-31512013000100007",
    "DhtktBFJN4qQCvwtLsd6QYn": "S1517-31512012000400008",
    "FcK5ZZZdvGtpFBbRphQrg4K": "S1517-31512012000400002",
    "FnHVzMPxhNcJBDstXKntdmk": "S1517-31512012000400003",
    "GCvkZLH754qBsVr5SzkYh9p": "S1517-31512013000200009",
    "GLdBqZsjfbyyGnjkqQGX55z": "S1517-31512013000300005",
    "GxKzFccStWdWLKBCRLrYn4p": "S1517-31512014000300011",
    "HDK7tr4R6hLmVdKDSsbLJBx": "S1517-31512012000300003",
    "JH7t44mzMJqP33GNZJ96TMw": "S1517-31512013000200006",
    "JMFVjBCjcxd683PggVkjNLg": "S1517-31512013000100002",
    "JvFxqXMWkhDC7SpHNCQHBRr": "S1517-31512014000200009",
    "JxS6h8GDvKmxdXFNm9YnVxJ": "S1517-31512014000400004",
    "KKF3CyxfkFGZZRF8CxK54Xh": "S1517-31512012000300002",
    "KrhfLJrb8kkNXybjd6JjKRk": "S1517-31512014000300004",
    "LMqpRZ8K67dYKZdcRVWZykf": "S1517-31512013000300008",
    "MHjH4ngRFTKzHjRND3CTk3w": "S1517-31512013000100004",
    "MZKP6XRwK48F6Z3p6HqdNxS": "S1517-31512014000300008",
    "Mdy6MLCMRkxmcpkGtqVFqkK": "S1517-31512014000300003",
    "NJLvq3fJPPpPfpmJdBYj5rp": "S1517-31512013000300009",
    "NQcjZMxDh7Vxs5HD7c7FF5G": "S1517-31512014000100011",
    "NbCJ4yKLYFXkchSQpzTWWqF": "S1517-31512014000200007",
    "NvWYyKrqzhDDqSmxYpvGdpJ": "S1517-31512012000400005",
    "P9b3pjdRsfm84FgFD69m6Ft": "S1517-31512013000200008",
    "Q9R4hZJtyVyVkLFTK6kfrLr": "S1517-31512013000100001",
    "Q9VjFhSK6P8pT7DVZZht7Nr": "S1517-31512014000300006",
    "RMWNGWTknXvFnZtKjNXpgSH": "S1517-31512012000200001",
    "RQFJrxG7mH3gPcwcn6k8zPd": "S1517-31512013000400001",
    "RfBxyP7V9SCV8h97Mkw5cxJ": "S1517-31512014000200010",
    "RvBvyRKZJDLndqQTpvBN6zp": "S1517-31512012000400007",
    "TNN4KKMFCJcmj6zzLS3DX5K": "S1517-31512014000200011",
    "TPdVBC49c777MX3jKgN5Rtw": "S1517-31512013000300004",
    "Tdf73c435dtm7KnmfDXF3fv": "S1517-31512012000200006",
    "Vkj8fG4JrbMDhhzHqLBc8Wd": "S1517-31512012000300006",
    "Vmtt46vBQqKPRSGjWpWpSyh": "S1517-31512013000400011",
    "XMvsvrxHcmHfDd8P34dBWkd": "S1517-31512014000400010",
    "XXTTQGFXTVwZxWqJsywnDqv": "S1517-31512013000200005",
    "XkFYyFvWBxkTptnz5PcwgtP": "S1517-31512014000300001",
    "YtQZ7QwXx75PpCC6Vm6d5hk": "S1517-31512014000200001",
    "Z9QV8vRjrDNhsLzNbsnsB9F": "S1517-31512013000400005",
    "ZKHqMTVHQMK3vnRvpBpz7Dd": "S1517-31512013000400002",
    "ZXnhJjWfqfTqQDKz6SfyWVw": "S1517-31512012000300005",
    "ZfjJsBZjGn5mzSyG4XBFJmf": "S1517-31512013000300011",
    "ZjvnLXkDMqrSwSYm4gZ9Vsf": "S1517-31512014000300007",
    "bQb4V9kFT33fWfW8DhGD5hr": "S1517-31512014000100005",
    "bZHp4cmfcVQQcMXQt4TZvSF": "S1517-31512014000100009",
    "bzG8TWvqVg6pX7cybRTqtjF": "S1517-31512012000200003",
    "cTrTGkSBzm7R5wv6J79vHPM": "S1517-31512013000400010",
    "cmhhfBNb6tDzwFMdBxpxVKc": "S1517-31512012000300009",
    "ctS7TY8Xpy4wNdhRBptJgkN": "S1517-31512013000300007",
    "dNKhHJDvhRLJVgvr89w8tns": "S1517-31512012000200007",
    "dd3wQfZGGrYfs5Gbz65XrBj": "S1517-31512013000200011",
    "gbD3XhfwS964ympR6J3Bz8B": "S1517-31512014000400007",
    "gsbrH8VmVhD8czsT6cYTtzF": "S1517-31512013000200007",
    "gwCVNbK3Kq3tDLwmB9XmSfC": "S1517-31512013000100005",
    "j4CCGgfssZqpGs9QMgTjB4C": "S1517-31512014000200002",
    "jBHKgmrnNYsYYPxLpPKFKRL": "S1517-31512013000300003",
    "kRYLkZfmQtkqNgL38hyNPfn": "S1517-31512013000400009",
    "m3bGWbNRQt7fhDHCpX7gfMt": "S1517-31512013000400006",
    "mVHYnBmjPm8RJTbKKxkgzWk": "S1517-31512013000300010",
    "mhxky4BsSnxRV7Sv3YpRRWw": "S1517-31512013000200003",
    "mjJDzFYsSxDm4vTdNdYgdTJ": "S1517-31512014000300002",
    "mmP43FZ77DLpMNgqtTqsM6g": "S1517-31512012000200005",
    "mqSWKNzwyN774m3pzMTvJpr": "S1517-31512012000400009",
    "ns4yyBcmyLpYWV6NDhG4c4D": "S1517-31512013000100011",
    "p3VXzgCbrknLQVs86xSTTfH": "S1517-31512012000200004",
    "pHQt8QS5zX7CcPqD7Zmd3hc": "S1517-31512013000100006",
    "rPBKwM4KN7vbLpr5W9Ld8QC": "S1517-31512014000100010",
    "rfKvCbwRQgsvwfzm7MXLgVk": "S1517-31512014000200008",
    "rhQNP9KbGz3TYkk9cLJgpjB": "S1517-31512014000200004",
    "sLS3mC3qPCDfCWsxbs6ZWNM": "S1517-31512012000300008",
    "sjFdfdrbkQ9ydjzxx9B63DN": "S1517-31512013000100009",
    "syj3QgyWwGjgnzx9L7fqYSv": "S1517-31512014000200006",
    "tBSxr3Y5ymhR6bYRmg3gzSS": "S1517-31512012000300010",
    "tZbxRyy4m3dmgDDLXwr55TD": "S1517-31512014000300009",
    "vcxDpfB9m8BgGbNntY9yPgp": "S1517-31512014000300005",
    "vm4LpbTcPST9yFdr5qqjGRL": "S1517-31512012000300004",
    "w3r3GNCDXD9DVZgLqyYJjkj": "S1517-31512013000300002",
    "wH8tYKmW85vvZjVrvpc6cDL": "S1517-31512013000300001",
    "wRk3fQCswFWyBV5vqxJgw4q": "S1517-31512012000200002",
    "wWzZnVDBsPxzKt7Jq7W5QRd": "S1517-31512012000300011",
    "xMmx8T3fLJwYHvVRgwqfDHD": "S1517-31512013000200004",
    "xzMKHXVfRSZYt8GT4ndTQYJ": "S1517-31512013000100003",
    "ycHQVjDBJXC3873CyCyjysK": "S1517-31512012000200008",
    "zJ8XQMxf8bfQWFV5x6Xm5Xf": "S1517-31512013000200002",
    "zYqXcW4dHBZ3B5j3zhRf8nN": "S1517-31512013000400008",
}

robertatakenaka commented 3 years ago

The problem occurred in https://www.scielo.br/scielo.php?script=sci_serial&pid=1517-3151&lng=pt&nrm=1, a title for which no ISSN is specified, neither print nor electronic. The value 1517-315 refers to the print ISSN.
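For reference, every PID v2 in the mapping above starts with `S1517-3151`, so the journal identifier can be read directly from the PID. The sketch below extracts that value and checks it against the standard ISSN check-digit rule (weights 8 down to 2, modulo 11). The helper names `issn_from_pid_v2` and `is_valid_issn` are illustrative, the PID v2 layout in the comment is an assumption, and nothing here is claimed to be the solution that was adopted.

```python
import re
from typing import Optional

# Assumed PID v2 layout: "S" + ISSN (9 chars) + year (4) + issue order (4)
# + article order (5), e.g. S1517-31512013000300006.
PID_V2 = re.compile(r"^S(\d{4}-\d{3}[\dX])\d{13}$")
ISSN = re.compile(r"\d{4}-\d{3}[\dX]")


def issn_from_pid_v2(pid_v2: str) -> Optional[str]:
    """Return the ISSN-shaped prefix of a PID v2, or None if it does not match."""
    match = PID_V2.match(pid_v2)
    return match.group(1) if match else None


def is_valid_issn(issn: str) -> bool:
    """Check the ISSN check digit: weights 8..2 on the first 7 digits, mod 11."""
    if not ISSN.fullmatch(issn):
        return False
    digits = issn.replace("-", "")
    total = sum(int(d) * w for d, w in zip(digits[:7], range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return digits[7] == expected


if __name__ == "__main__":
    issn = issn_from_pid_v2("S1517-31512013000300006")
    print(issn, is_valid_issn(issn))  # prints: 1517-3151 True
```

A value recovered this way is only a candidate: whether it is the print or the electronic ISSN still has to come from the title record.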

robertatakenaka commented 3 years ago

More details on the solution: https://docs.google.com/document/d/1DrRPArvm8pzpuwRzbZQV3H7q3YoQ9dPH10BhfK89Uko/edit#heading=h.yrp07bkzy4nw