daitss / core

DAITSS: Dark Archive In The Sunshine State
GNU General Public License v3.0

PRODUCTION D2: Submit-direct throwing EOF error #430

Closed childree closed 12 years ago

childree commented 13 years ago

I can't find a pre-existing ticket for this. I'm having an issue with submit-direct. Any ideas?

$ less /var/log/daitss/submit-direct/20110518_TRAC167.log
Picked up _JAVA_OPTIONS: -Xms16m -Xmx4096m
Error submitting /var/daitss/exceptions/tickets/167/CF00000037_0: end of file reached
Picked up _JAVA_OPTIONS: -Xms16m -Xmx4096m
Error submitting /var/daitss/exceptions/tickets/167/CF00000040_0: end of file reached
Picked up _JAVA_OPTIONS: -Xms16m -Xmx4096m
Error submitting /var/daitss/exceptions/tickets/167/CF00000004_2: end of file reached
Picked up _JAVA_OPTIONS: -Xms16m -Xmx4096m
Error submitting /var/daitss/exceptions/tickets/167/CF00000048_1: end of file reached
/var/log/daitss/submit-direct/20110518_TRAC167.log (END) 
lydiam commented 13 years ago

Note: these packages archived successfully during D2 testing. See notes in https://spreadsheets.google.com/ccc?key=tdVST-N_VC9zUCuS6GemlvA#gid=2. They would have been submitted using submit-direct. Also: I've confirmed that the D2 Ops Manual states that command-line submission should be done via the "submit" command: http://wiki.fcla.edu/wiki/index.php/DL:FDA_DAITSS2_Operations_Manual#Command_line_submission_.28the_submit_command.29.

avatar382 commented 13 years ago

I noticed that all of these packages had wrong permissions on the descriptor:

for example:

[daitss@fclnx30 CF00000048_1]$ ll
total 2456
-rw-r--r-- 1 fcljac fcljac 1880553 Nov 10  2008 48no2.pdf
-rw-r----- 1 fcljac fcljac    2797 Nov 17  2008 CF00000048_1.xml
drwxr-xr-x 2 fcljac fcljac    4096 Nov 10  2008 common_files/
-rw-r--r-- 1 fcljac fcljac  296444 Nov 10  2008 fhq_48_2.txt
-rw-r--r-- 1 fcljac fcljac  287650 Nov 10  2008 SN00154113_0048_002.sgm

Let's try setting the permissions on the descriptor so that it is readable by the daitss user; that should fix this issue.
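A quick way to spot this kind of problem before submitting is to list the entries in a package directory that the submitting user cannot read. This is only a minimal sketch: the helper name `unreadable_entries` and its `package_dir` argument are hypothetical and not part of DAITSS.

```ruby
# Return the names of entries directly inside `package_dir` that the
# current user cannot read. (Hypothetical helper, not a DAITSS API.)
def unreadable_entries(package_dir)
  Dir.children(package_dir).sort.reject do |name|
    File.readable?(File.join(package_dir, name))
  end
end
```

Run as the daitss user against the listing above, `unreadable_entries("/var/daitss/exceptions/tickets/167/CF00000048_1")` would be expected to report the descriptor, since its mode is `-rw-r-----` and it is owned by fcljac (assuming daitss is not in the fcljac group).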

childree commented 13 years ago

I think I'm getting this again, even after changing the permissions to 775 [/var/daitss/incoming/ftpdl/UFE0042419]:

Tue Jun 21 09:14:54 -0400 2011 -- UFE0042390 -- submitted successfully: EDE9MZZ9L_O28FKY
Fatal error: xmlParseCharRef: invalid xmlChar value 22 at :27.
Fatal error: xmlParseCharRef: invalid xmlChar value 22 at :27.
Fatal error: xmlParseCharRef: invalid xmlChar value 22 at :27.
Fatal error: xmlParseCharRef: invalid xmlChar value 22 at :27.
Picked up _JAVA_OPTIONS: -Xms16m -Xmx4096m
/opt/web-services/sites/core/current/lib/daitss/proc/xmlvalidation.rb:87:in `load': end of file reached (EOFError)
        from /opt/web-services/sites/core/current/lib/daitss/proc/xmlvalidation.rb:87:in `validate_xml'
        from /opt/web-services/sites/core/current/lib/daitss/proc/sip_archive.rb:67:in `validate!'
        from /opt/web-services/sites/core/current/lib/daitss/proc/sip_archive.rb:51:in `valid?'
        from /opt/web-services/sites/core/current/lib/daitss/archive/submit.rb:85:in `submit'
        from /opt/web-services/sites/core/shared/bundle/ruby/1.8/gems/dm-transactions-1.0.2/lib/dm-transactions.rb:373:in `transaction'
        from /opt/web-services/sites/core/shared/bundle/ruby/1.8/gems/dm-transactions-1.0.2/lib/dm-transactions.rb:131:in `commit'
        from /opt/web-services/sites/core/shared/bundle/ruby/1.8/gems/dm-transactions-1.0.2/lib/dm-transactions.rb:195:in `within'
        from /opt/web-services/sites/core/shared/bundle/ruby/1.8/gems/dm-transactions-1.0.2/lib/dm-transactions.rb:131:in `commit'
        from /opt/web-services/sites/core/shared/bundle/ruby/1.8/gems/dm-transactions-1.0.2/lib/dm-transactions.rb:373:in `transaction'
        from /opt/web-services/sites/core/current/lib/daitss/archive/submit.rb:79:in `submit'
        from /opt/web-services/sites/core/current/bin/submit-direct:75:in `submit_package'
        from /opt/web-services/sites/core/current/bin/submit-direct:121
Fatal error: xmlParseCharRef: invalid xmlChar value 22 at :27.
Fatal error: xmlParseCharRef: invalid xmlChar value 22 at :27.
Fatal error: xmlParseCharRef: invalid xmlChar value 22 at :27.
Fatal error: xmlParseCharRef: invalid xmlChar value 22 at :27.
Picked up _JAVA_OPTIONS: -Xms16m -Xmx4096m
/opt/web-services/sites/core/current/lib/daitss/proc/xmlvalidation.rb:87:in `load': end of file reached (EOFError)
        from /opt/web-services/sites/core/current/lib/daitss/proc/xmlvalidation.rb:87:in `validate_xml'
        from /opt/web-services/sites/core/current/lib/daitss/proc/sip_archive.rb:67:in `validate!'
        from /opt/web-services/sites/core/current/lib/daitss/proc/sip_archive.rb:51:in `valid?'
        from /opt/web-services/sites/core/current/lib/daitss/archive/submit.rb:85:in `submit'
        from /opt/web-services/sites/core/shared/bundle/ruby/1.8/gems/dm-transactions-1.0.2/lib/dm-transactions.rb:373:in `transaction'
        from /opt/web-services/sites/core/shared/bundle/ruby/1.8/gems/dm-transactions-1.0.2/lib/dm-transactions.rb:131:in `commit'
        from /opt/web-services/sites/core/shared/bundle/ruby/1.8/gems/dm-transactions-1.0.2/lib/dm-transactions.rb:195:in `within'
        from /opt/web-services/sites/core/shared/bundle/ruby/1.8/gems/dm-transactions-1.0.2/lib/dm-transactions.rb:131:in `commit'
        from /opt/web-services/sites/core/shared/bundle/ruby/1.8/gems/dm-transactions-1.0.2/lib/dm-transactions.rb:373:in `transaction'
        from /opt/web-services/sites/core/current/lib/daitss/archive/submit.rb:79:in `submit'
        from /opt/web-services/sites/core/current/bin/submit-direct:75:in `submit_package'
        from /opt/web-services/sites/core/current/bin/submit-direct:121
(END) 

When trying to validate the descriptor, it appears that there is an invalid character within the descriptor being escaped as "&#x16":

[UFE0042419]$ validate UFE0042419.xml
java -Dwebcache=/var/daitss/webcache/ -Dfile=UFE0042419.xml -jar /opt/daitss/lib/xmlvalidator.jar
Exception in thread "main" org.xml.sax.SAXParseException: Character reference "&#x16" is an invalid XML character.
        at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:264)
        at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:292)
        at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:172)
        at edu.fcla.da.xml.Validator.validate(Unknown Source)
        at edu.fcla.da.xml.Validator.main(Unknown Source)
[UFE0042419]$ grep "&#x16" UFE0042419.xml
In this research, we first successfully record action potentials via the UF system adopting the IF neuron circuit in the in-vivo recording. To conduct an in-vivo recording, the implanted electrode must be well placed to detect available neural signals, and the analog and digital parts of the UF system need parameter optimization and calibration. In addition, the dual system experiment, comprising the UF system and the TDT system, verifies that the UF recording system extracts quality in-vivo signals. In an in-vivo recording, the spike sorting results for these recording systems classify the same neural signals. The UF system can record 1000 Vpp high action potential signals but induces

Should I start a new ticket for this issue?

avatar382 commented 13 years ago

Weird, I was able to submit the package:

[manny@fclnx30]/var/daitss/incoming/ftpdl% submit --package UFE0042390 --username manny --password ***
Picked up _JAVA_OPTIONS: -Xms16m -Xmx4096m
Tue Jun 21 14:10:26 -0400 2011 -- UFE0042390 -- submitted successfully: EB6T6957N_BY8023

I did open the descriptor on darchive with vi, add a newline, and then overwrite it. I think the problem was invalid chars in the descriptor, which was fixed by saving it with vi.


avatar382 commented 13 years ago

Something about this package's invalid descriptor causes it to break submit. Even if you submit it via the GUI, you get a 500 error and the same stack trace.
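One way to turn this crash into an ordinary rejection would be to rescue the `EOFError` at the point where the validator's output is read. The trace does not show what `load` at xmlvalidation.rb:87 actually reads, so the sketch below rests on an assumption: that it deserializes output from the external validator and hits end of input when the validator dies early. `Marshal` is used purely for illustration, and `load_validation_result` is a hypothetical name.

```ruby
require "stringio"

# Hypothetical wrapper: read a serialized validation result from `io`.
# If the input ends prematurely (for example, because the external
# validator crashed and wrote nothing), report a validation failure
# instead of letting EOFError propagate up to the caller.
def load_validation_result(io)
  Marshal.load(io)
rescue EOFError => e
  { valid: false, errors: ["validator produced no output: #{e.message}"] }
end

# An empty stream stands in for a validator that exited without output.
result = load_validation_result(StringIO.new(""))
puts result[:valid] # prints "false"
```

With something like this in place, a bad descriptor would surface as a rejected package with a readable message rather than a stack trace or a 500 error.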

childree commented 13 years ago

I've talked to Caitlin and she said she also had an issue with this character. She said to just remove the "&#x16" and put in "[mu]"; the character is mu. This is what I used to do with the nematode journals when special characters came up. I'll modify the descriptor and try submitting it again.

- The validation now comes back clean on this descriptor.
- This package is now submitted.

For the time being, we now know that special characters within the descriptor can cause submit to behave badly.
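Until submit handles this more gracefully, descriptors could be screened for illegal character references before submission. The sketch below checks each numeric character reference against the `Char` production of XML 1.0 (`#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]`). The helper names are hypothetical. Note that 0x16 is decimal 22, which matches the "invalid xmlChar value 22" in the log above.

```ruby
# True if code point `cp` is allowed by the XML 1.0 `Char` production.
def xml10_char?(cp)
  cp == 0x9 || cp == 0xA || cp == 0xD ||
    (0x20..0xD7FF).cover?(cp) ||
    (0xE000..0xFFFD).cover?(cp) ||
    (0x10000..0x10FFFF).cover?(cp)
end

# Matches decimal and hexadecimal character references. The trailing
# semicolon is optional so that malformed references like the one in
# this ticket are still caught. Caveat: without the semicolon, hex
# digits that follow the reference would be read as part of it.
CHAR_REF = /&#(?:x([0-9A-Fa-f]+)|([0-9]+));?/

# Return the code points of all character references in `xml_text`
# that are not legal XML 1.0 characters. (Hypothetical helper.)
def invalid_char_refs(xml_text)
  xml_text.scan(CHAR_REF)
          .map { |hex, dec| hex ? hex.to_i(16) : dec.to_i }
          .reject { |cp| xml10_char?(cp) }
end

p invalid_char_refs("record 1000 &#x16Vpp high") # prints [22]
```

An empty result means no illegal references were found; it does not replace full validation of the descriptor.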

avatar382 commented 13 years ago

Mu the Greek character or the Japanese character?


childree commented 13 years ago

I'd say Greek.

avatar382 commented 12 years ago

The cause was a special character in the descriptor. The workaround was to modify the descriptor. It's probably the case that the character set was incorrect in the descriptor. Let's pull this package and see if the character was encoded correctly.

lydiam commented 12 years ago

The encoding on this descriptor (the archived version) is encoding="ISO-8859-1" and doesn't support Greek characters. &#x16 appears to be a Greek delta in UTF-8 encoding, but from the grep example above it looks like that wasn't even escaped, so it was literally an ampersand followed by #x16. Do we have the "rejected" version of this descriptor or a similar one? I'd like to see if it validates when the encoding is changed. I'd like to better understand which character encoding can be used with XML and whether daitss will accept proper UTF-8 encoding. I'm concerned that we not reject properly encoded characters in descriptive metadata.
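Whether a given character can be written directly in an ISO-8859-1 document can be checked by attempting the transcoding. A minimal sketch follows, with `representable?` as a hypothetical helper name. One detail worth knowing: ISO-8859-1 does contain the micro sign (U+00B5), which looks like mu, but not the Greek small letter mu (U+03BC).

```ruby
# True if every character of `text` can be transcoded into `encoding`.
def representable?(text, encoding)
  text.encode(encoding)
  true
rescue Encoding::UndefinedConversionError
  false
end

greek_mu   = "\u03BC" # GREEK SMALL LETTER MU
micro_sign = "\u00B5" # MICRO SIGN, visually similar

p representable?(greek_mu, "ISO-8859-1")   # prints false
p representable?(micro_sign, "ISO-8859-1") # prints true
p representable?(greek_mu, "UTF-8")        # prints true
```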

childree commented 12 years ago

For the rejected descriptor, curl the package and edit the [mu] back to &#x16 then test it out on ripple. It should give you the same results.

lydiam commented 12 years ago


Was the text literally "&#x16", or was it escaped in some fashion? I downloaded the descriptor but I don't even see any of the text that you grepped for.

childree commented 12 years ago

In the SIP descriptor:

<dc:description>This research tests...In this research, we first successfully record action potentials via the UF system adopting the IF neuron circuit in the in-vivo recording. To conduct an in-vivo...system can record 1000 [mu]Vpp high...

Where "[mu]" now appears, the text was literally the following string of characters, minus the quotes: "&#x16". The text was supposed to read "μVpp", and somewhere prior to our receiving it, it was escaped as "&#x16Vpp".

childree commented 12 years ago

Perhaps this might shed light on exactly what &#x16 is escaping: http://www.fileformat.info/info/unicode/char/16/index.htm

The character should actually be mu; this is obvious from the context of the document. Here is some info on mu: http://www.fileformat.info/info/unicode/char/3bc/index.htm

lydiam commented 12 years ago

My current question is: is there any way to correctly encode the "mu" character, or a variety of other characters, so that descriptors validate? dmdSecs can contain a number of different characters that would be unacceptable in directory or file names (for example, ampersand or brackets). It looks like our "validate" parser won't accept encoding=utf-16. I may take this offline, because clearly in this case "&#x16" is nothing like the correct representation of "mu" and is some sort of odd mistake.

lydiam commented 12 years ago

The answer is yes: you must use encoding="UTF-8" and the numeric representation of the character "mu", which is "&#x3BC;" (μ).
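For reference, such numeric representations can be generated mechanically from a character's Unicode code point. The following is a minimal sketch (the helper name `to_numeric_refs` is hypothetical); it only rewrites non-ASCII characters and does not escape markup characters such as "&" or "<".

```ruby
# Replace every non-ASCII character in `text` with a hexadecimal
# numeric character reference built from its Unicode code point.
def to_numeric_refs(text)
  text.gsub(/[^[:ascii:]]/) { |ch| format("&#x%X;", ch.ord) }
end

puts to_numeric_refs("record 1000 \u03BCVpp") # prints "record 1000 &#x3BC;Vpp"
```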