pagnani / ArDCA.jl

Autoregressive networks for protein
MIT License
31 stars 8 forks source link

Missing permute in `(A::ArNet)(x::Matrix{T})`? #18

Closed christophfeinauer closed 2 years ago

christophfeinauer commented 2 years ago

In the following function

function (A::ArNet)(x::Matrix{T}) where T <: Integer 
    @extract A : J H p0 idxperm
    backorder = sortperm(idxperm)
    N, M = size(x)
    length(H) == N - 1 || throw(DimensionMismatch("incompatible size between input and fields"))
    q = length(p0)
    output = Array{eltype(p0),2}(undef,N,M)
    #Threads.@threads
    for i in 1:M     
        output[:,i] .= _outputarnet(view(x,:,i), J, H, p0, N, q)
    end
    permuterow!(output,backorder)
end

the input matrix x is backpermuted (permuterow!(output,backorder)) after calculating output, but never permuted before calculating output. Is that a bug or wanted behavior? If the latter it is slightly confusing, since the vector version do permute the input before calculating output. If it's a bug I can make a PR.

pagnani commented 2 years ago

I think it is ok. It is indeed permuted in the construction when permorder != from :NATURAL is chosen. Did you find an error?

christophfeinauer commented 2 years ago

Well if you want to evaluate a trained model on a new dataset, then

arnet, arvar = ardca(some_fasta)
Z = read_fasta_alignment(some_other_fasta, 1.0)
arnet(Z)

gives wrong results, since the sequences are not permuted if given as part of a matrix Z.

The code

arnet, arvar = ardca(some_fasta)
Z = read_fasta_alignment(some_other_fasta, 1.0)
hcat([arnet(Z[:,m]) for m in 1:size(Z,2)]...)

gives correct results, since the vector version of arnet permutes the sequence Z[:,m].

If that is wanted behavior, we can close this issue.

pagnani commented 2 years ago

Then I have to think about it ... but if you are willing to do a PR that would be great!

A

On Wed, Mar 9, 2022 at 5:06 PM Christoph Feinauer @.***> wrote:

Well if you want to evaluate a trained model on a new dataset, then

arnet, arvar = ardca(some_fasta) Z = read_fasta_alignment(some_other_fasta, 1.0) arnet(Z)

gives wrong results, since the sequences are not permuted if given as part of a matrix Z.

The code

arnet, arvar = ardca(some_fasta) Z = read_fasta_alignment(some_other_fasta, 1.0) hcat([arnet(Z[:,m]) for m in 1:size(Z,2)]...)

gives correct results, since the vector version of arnet permutes the sequence Z[:,m].

If that is wanted behavior, we can close this issue.

— Reply to this email directly, view it on GitHub https://github.com/pagnani/ArDCA.jl/issues/18#issuecomment-1063088737, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAJYTHVXCMLA2GVUGLUMOIDU7DD7VANCNFSM5QJYHCJQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you commented.Message ID: <pagnani/ArDCA. @.***>

--

Politecnico Torino Corso Duca degli Abruzzi 24, I-10129 Torino - Italy +39 011 0907417 web: http://www.polito.it/cmp/home/pagnani

Italian Institute for Genomic Medicine (IIGM) IRCCS di Candiolo, SP142 km 3,95 10060 Candiolo (TO)

pagnani commented 2 years ago

I finally understood the problem. Of course you are right there was an inconsistency

corrected by commit: 9112dc6

pagnani commented 2 years ago

I wonder if I should tag a new version

pagnani commented 2 years ago

Closing